Multicast concentrators

ABSTRACT

Broadband switching, including the implementation of and control over a massive sub-microsecond switching fabric, is disclosed. To effect the attributes of the switching fabric, conditionally nonblocking components are used as building blocks in an interconnection network which is recursively constructed. The properties of the interconnection network are preserved during each recursion to thereby configure the massive switching fabric from scalable circuitry.

CROSS-REFERENCE TO RELATED APPLICATION

[0001] This application is a non-provisional application of provisional application Ser. No. 60/212,333 filed Jun. 16, 2000.

BACKGROUND OF THE DISCLOSURE

[0002] 1. Field of the Invention

[0003] This invention relates generally to broadband switching and, more particularly, to the design of the sub-microsecond switching and control over a massive broadband switching network.

[0004] 2. Description of the Background Art

[0005] As telecommunication systems have evolved, the demand for bandwidth has been ever increasing in both transmission and switching. Advances in fiber optics afford ample transmission capacity, while switching, the technology that puts transmission capacity to flexible use, has not kept pace. Because the scale of a switching fabric is subject to various constraints (e.g., electronic or physical), a large switch is often constructed from the networking of smaller ones. Thus, for example, the public switched telephone network is an interconnection of numerous switch offices; likewise, the core of the modern digital switching system is typically a multi-stage network of smaller switches. Most important, in this modern era of broadband communications, countless primitive switching units inside a single chip are integrated into a large switch. Massive integration of switching components has been a fertile area of research and exploratory development efforts.

[0006] The results of such efforts are generally ad hoc in nature, without rigorous underpinnings; such underpinnings, when uncovered, lead to general elucidating principles and, accordingly, more efficient implementations of switching networks follow naturally from the principles. In this way, known but specific industrial designs and/or commercial applications are understood as merely special cases of a broad array of cases. From another viewpoint, sporadic findings in the literature translate into instances of different special cases of the general principles.

[0007] By way of a heuristic example of the benefit of uncovering foundational principles, a switching network at a microscopic level is first considered to illustrate the foregoing observations. It is known in the art that efficacious control over a packet switching network composed of nodes is effected whenever the switching decision at each node is determined only by information carried in each local input data packet to the node; such a control mechanism is called “self-routing”. The concept of “self-routing” was initially disclosed by D. H. Lawrie in an article entitled “Access and alignment of data in an array processor,” as published in IEEE Trans. Comp., vol. 24, pp. 1145-1155, 1975. Lawrie postulated the following in-band control mechanism for a specific banyan-type network (called the Omega network) composed of a cascade of stages wherein each stage is further composed of a number of two-input/two-output switching cells: upon entering the network, a data packet composed of a sequence of bits is prepended with its binary destination address in the form d₁d₂ . . . d_(n). The bit d_(j) indicates the preference between the only two outputs of a stage-j switching cell and is consumed by the stage-j switching control. Thus, the switching state of a cell is determined by just this leading bit of each of the two input packets. The existing self-route mechanism used in this particular banyan-type network considered by Lawrie is ad hoc, that is, determination of the routing tag of a packet is one of trial-and-error. The main reason behind the trial-and-error procedure is that Lawrie has not had the benefit of a fundamental theoretical approach to determine the routing tag for self-routing, as covered in the sequel by the inventive subject matter in accordance with the present invention. The theoretical underpinnings are founded upon the concept of “guide of a bit-permuting network”, which is a sequence of numbers, whereby the guide ensures that the routing tag for any given bit-permuting network can be determined once the guide of that network is computed. As will be shown, the guide of the networks studied by Lawrie happens to be a special case wherein the guide is the monotonically increasing 1, 2, . . . , n. The destination address can no longer be used as the routing tag for any other banyan-type network whose guide is not monotonically increasing. For this reason, those banyan-type networks whose routing tag “seems not related” to the destination address have not been widely studied. But, ironically, those widely studied networks, including the Omega network studied by Lawrie, are actually the least optimal ones with regard to the layout complexity under the popular “2-layer Manhattan model with reserved layers” among a huge family of equivalent networks.
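The destination-tag mechanism described for the Omega network can be illustrated with a minimal sketch. It assumes the textbook form of the Omega network (a perfect shuffle of the port addresses ahead of each stage of 2×2 cells), which the present text does not spell out; the function name `omega_route` and its arguments are hypothetical, and contention between packets is ignored.

```python
def omega_route(source, dest, n):
    """Trace one packet through an n-stage Omega network with 2**n ports.

    Assumption: each stage is preceded by a perfect shuffle (a left rotation
    of the n-bit port address), and the stage-j cell sends the packet to its
    output 0 or output 1 according to destination bit d_j (d_1 is the most
    significant bit). Contention with other packets is not modeled.
    """
    mask = (1 << n) - 1
    position = source
    path = [position]
    for j in range(1, n + 1):
        # Perfect shuffle: rotate the n-bit address left by one bit.
        position = ((position << 1) | (position >> (n - 1))) & mask
        # The stage-j cell consumes bit d_j and selects the matching output.
        d_j = (dest >> (n - j)) & 1
        position = (position & ~1) | d_j
        path.append(position)
    return path


# Route from input 1100 to output 1110 in a 16x16 network (n = 4).
print([format(p, "04b") for p in omega_route(0b1100, 0b1110, 4)])
```

Under these assumptions the packet arrives at the output whose address equals d₁d₂ . . . d_(n) regardless of the input port, which is the special case in which the destination address itself serves as the routing tag.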

[0008] The issues of equivalence among networks and optimization of layout complexity bring up a second example highlighting the shortcomings of the past methods. If all those widely studied networks are not optimal, then it remains to be explored which networks are optimal and can be used to replace the widely studied ones, and how to construct such optimal networks in a systematic way. The present invention addresses these problems.

[0009] All banyan-type networks are equivalent in a weak sense, but in some applications only networks that are equivalent in a stronger sense can be deployed in replacement of each other. A related example of the shortcomings of the existing art is the lack of a systematic way for the adaptation of one network into an equivalent of another in strong senses.

[0010] A fourth motivating example, which considers a switching network at a macroscopic level, relates to the properties of a switching network itself. The component complexity of an N×N nonblocking network is at least N²/4. (Here the definition of a nonblocking network requires the network to be unique-routing to begin with, because otherwise there are different senses for a network to be “nonblocking”.) The quadratic order in this bound indicates the intrinsically high complexity in the nonblocking property of the network. So instead of applying a nonblocking network in switch design, the focus is on uncovering simple networks that preserve “conditionally nonblocking properties” of switches and thereby construct large conditionally nonblocking switches out of small ones in an economical way. Recursive applications of such construction then lead to conditionally nonblocking switches of indefinitely large sizes. Such a theoretical recursive property then allows the physical construction of switching fabric at a throughput level much higher than that of existing routers/switches by the contemporary ASIC technology. In the literature, there are individual instances of certain conditionally nonblocking switches constructed by switching networks, such as the one disclosed by A. Huang and S. Knauer in an article entitled “Starlite: a wideband digital switch,” as published in Proceedings of Globecom '84, Atlanta, pp. 121-125, 1984. However, these instances of conditionally nonblocking property are not preserved by simple networks and hence do not enjoy the advantage of recursive construction.

[0011] Banyan-type networks are recursive applications of 2-stage interconnection or, at least, equivalent to such recursive applications. In contrast with 3-stage alternate-routing switching that is popular in telephony, a 2-stage switching network is more compact in nature and thereby facilitates the VLSI implementation of massive recursive application. More importantly, the unique-routing nature of 2-stage switching is more compatible with sub-microsecond control inside a broadband switching chip. A fifth example of deficiency of the existing art is the lack of a systematic method of physical implementation of recursive 2-stage interconnection that takes advantage of today's technologies in making switching fabrics at a much higher level of throughput than the largest existing routers.

[0012] The critical problem with 2-stage switching is blocking, and one way to alleviate the blocking problem is by “statistical line grouping”, which replaces every interconnection line in the network by a bundle of lines and, at the same time, dilates the size of every node proportionally. A critical issue in applying the method of statistical line grouping lies in the choice of the switch to fill the role of a dilated node. The selected switch does not have to be a nonblocking switch but needs some partial nonblocking property that is articulated in the present invention. (A partial nonblocking property is more economically achievable than the full nonblocking property of a switch.) Meanwhile, the control over the selected switch must also be compatible with sub-microsecond control inside a broadband switching chip. Ideally, there should be a self-routing mechanism inside the selected switch that can be smoothly blended with the self-routing mechanism over the banyan-type network. A final example highlighting the shortcomings of the past methods is the lack of a clearly superior candidate for this selected switch. The present invention proposes the “concentrator” as a perfect candidate. When multicast switching is involved, a “multicast concentrator” replaces the concentrator.

SUMMARY OF THE INVENTION

[0013] The shortcomings of the prior art, as well as other limitations and deficiencies, are obviated in accordance with the present invention by applying algebraic principles to the physical realization of a large switching fabric based upon contemporary technologies.

[0014] In accordance with a broad system aspect of the present invention, an m-to-n multicast concentrator for routing input signals, each of the input signals being 0-bound, 1-bound, bicast, or idle, includes: (a) m input ports to receive the input signals; (b) m output ports partitioned into two groups wherein m−n of the m output ports are grouped as a 0-output group and the remaining n output ports are grouped as a 1-output group; and (c) means, responsive to the input signals, for routing a maximum total number of 0-bound and bicast ones of the input signals to the 0-output group and the maximum total number of 1-bound and bicast ones of the input signals to the 1-output group.
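The routing requirement in element (c) can be read as a counting rule, which the following minimal behavioral sketch illustrates. It is an interpretation under stated assumptions, not the circuitry of the invention: the signal labels `'0'`, `'1'`, `'B'` (bicast), and `'I'` (idle) and the function name `multicast_concentrate` are hypothetical, and the sketch reports only how many signals reach each output group, not which port each one occupies.

```python
from collections import Counter


def multicast_concentrate(signals, n):
    """Return (count delivered to 0-output group, count delivered to 1-output group).

    signals: sequence of m labels, each '0' (0-bound), '1' (1-bound),
             'B' (bicast), or 'I' (idle). These labels are assumed here.
    n:       size of the 1-output group; the 0-output group has m - n ports.
    """
    m = len(signals)
    if not 0 <= n <= m:
        raise ValueError("n must satisfy 0 <= n <= m")
    counts = Counter(signals)
    if set(counts) - set("01BI"):
        raise ValueError("signals must be '0', '1', 'B', or 'I'")
    # A bicast signal requests one port in each output group.
    wanting_0 = counts["0"] + counts["B"]
    wanting_1 = counts["1"] + counts["B"]
    # "Maximum total number" is capped by the size of each group.
    return min(wanting_0, m - n), min(wanting_1, n)


print(multicast_concentrate(list("0B1IB01I"), 4))
```

In this reading, an 8-to-4 multicast concentrator offered two 0-bound, two 1-bound, two bicast, and two idle signals fills all four ports of each output group.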

[0015] In accordance with a method aspect of the present invention, a method for implementing an m-to-n multicast concentrator with reference to the network topology of an m-to-n concentrator, the m-to-n concentrator having m−n output ports grouped as a 0-output group and n output ports grouped as a 1-output group and being constructed from a multi-stage interconnection network of sorting cells, includes: (a) constructing a multi-stage interconnection network of nodes having the same network topology as the multi-stage interconnection network of the m-to-n concentrator; and (b) filling each of the nodes of the constructed network with a bicast cell.

BRIEF DESCRIPTION OF THE DRAWING

[0016] The teachings of the present invention can be readily understood by considering the following detailed description in conjunction with the accompanying drawings, in which:

[0017] FIGS. 1A-1H depict eight of the twenty-seven connection states of a 2×3 circuit element;

[0018] FIGS. 2A-B depict the “bar state” and the “cross state” connection states of a switching cell;

[0019] FIGS. 2C-F depict the four connection states of an expander cell;

[0020] FIG. 3A depicts an exemplary interconnection network with three nodes;

[0021] FIG. 3B depicts the interconnection network of FIG. 3A wherein the nodes of the network are filled with switching cells to constitute a switch;

[0022] FIG. 4 depicts a route through an interconnection network;

[0023] FIG. 5A depicts an exemplary routable interconnection network;

[0024] FIG. 5B depicts an exemplary switching network wherein the nodes of the network of FIG. 5A are filled with switches, including switching cells and distributors;

[0025] FIG. 6A depicts a generic M×N k-stage interconnection network illustrating the layout of such a network;

[0026] FIG. 6B depicts an exemplary 5×4 2-stage interconnection network conforming to the layout of FIG. 6A;

[0027] FIG. 6C depicts one illustrative manner of prescribing an external input/output order on a multi-stage network;

[0028] FIG. 6D depicts one illustrative manner of splitting the prescribed external input/output order for purposes of linking one multi-stage network to another multi-stage network;

[0029] FIG. 6E depicts the results of the product of two 16×16 exchanges in one order;

[0030] FIG. 6F depicts the results of the product of the same two exchanges in FIG. 6E but in reverse order;

[0031] FIG. 7 depicts a 16×16 4-stage network as an example of a 2^(n)×2^(n) multi-stage network where n=4;

[0032] FIG. 8 depicts an exemplary plain 2-stage interconnection network with parameters m=2 and n=8;

[0033] FIG. 9 depicts the linear addressing scheme on an exemplary 2-stage interconnection network;

[0034] FIG. 10 depicts the vector addressing scheme on the same exemplary 2-stage interconnection network as in FIG. 9;

[0035] FIG. 11A depicts the manner in which a data signal progresses through a generic 2-stage interconnection network with an output exchange;

[0036] FIG. 11B depicts the manner in which a data signal progresses through a generic 2-stage interconnection network with an input exchange;

[0037] FIG. 12 depicts an exemplary 2-stage interconnection with an output exchange for a 3×5 2-stage interconnection network;

[0038] FIG. 13 depicts an exemplary 2-stage interconnection with an input exchange for a 3×5 2-stage interconnection network;

[0039] FIG. 14 depicts the manner in which “basic building block” networks of 2×2, 3×3, and 5×5 are used in an exemplary recursive 2-stage construction;

[0040] FIG. 15 depicts the manner of mapping the recursive 2-stage construction exemplified by FIG. 14 into a binary tree diagram;

[0041] FIGS. 16-19 depict the manner of building a recursive 2-stage interconnection with an input exchange from cells;

[0042] FIG. 20 depicts the binary tree associated with the recursive construction depicted in FIGS. 16-19;

[0043] FIG. 21A depicts a (3 2 1) permutation on an 8×8 exchange;

[0044] FIG. 21B depicts a (1 2 3) permutation on an 8×8 exchange;

[0045] FIG. 21C depicts a (3 1) permutation on an 8×8 exchange;

[0046] FIG. 21D depicts a combined (1 4)(2 3) permutation on an 8×8 exchange;

[0047] FIG. 22 depicts a network expressed as [id:(4 3 2 1):(1 4 2 3):(3 4):id]₄;

[0048] FIG. 23 depicts a network expressed as [:(3 2 1):(3 2 1):]₃;

[0049] FIG. 24 depicts a network expressed as [:(3 4):(1 4):(4 3 2 1):]₄ which is not routable;

[0050] FIG. 25 depicts a network expressed as [:(2 3):(1 3):(3 2 1):]₃ which is one network comprising part of the network of FIG. 24;

[0051] FIG. 26 depicts the same network of FIG. 25 comprising another part of the network of FIG. 24;

[0052] FIG. 27 depicts a graphical manner for obtaining the trace and the guide of the 16×16 banyan-type network [id:(3 4):(1 4):(2 4):id];

[0053] FIG. 28A summarizes the paths of FIG. 27 to generate the trace;

[0054] FIG. 28B summarizes the paths of FIG. 27 to generate the guide;

[0055] FIG. 29 depicts a route through a 16×16 banyan-type network [id:(3 4):(1 4):(2 4):(4 3 2 1)]₄ from the origination address 1100 to the destination address 1110;

[0056] FIG. 30A summarizes the paths of FIG. 24 to generate the trace;

[0057] FIG. 30B summarizes the paths of FIG. 24 to generate the guide;

[0058] FIG. 31 depicts the progression of input/output addresses through the network of FIG. 24;

[0059] FIG. 32A depicts an exemplary connection request constraint compliant with the compressor constraint for a 5×5 switch;

[0060] FIG. 32B depicts a reordering of output addresses of the switch of FIG. 32A which is order preserving;

[0061] FIG. 32C depicts five concurrent connections over a compressor implemented from a generic switch;

[0062] FIG. 32D is a representation whereby the compressor of FIG. 32C is bent into a cylinder to visualize the order-preservation of the compressor;

[0063] FIGS. 33A-D show the six combinations of concurrent connections required for a 3×3 switch to qualify as a compressor;

[0064] FIG. 34 depicts, for a generic switch, multicast connections from five input ports to nine output ports that can be concurrently accommodated by an expander which are compliant with the expander constraint;

[0065] FIGS. 35A-P depict a 4×4 switch which qualifies as a compressor if and only if it accommodates at least the sixteen combinations of concurrent point-to-point connections shown;

[0066] FIGS. 36A-P depict a 4×4 switch which qualifies as an upturned compressor if and only if it accommodates at least the sixteen combinations of concurrent point-to-point connections shown;

[0067] FIGS. 37A-P depict a 4×4 switch which qualifies as a UC nonblocking switch if and only if it accommodates at least the sixteen combinations of concurrent point-to-point connections shown;

[0068] FIG. 38A depicts an I/O matching from 10 input ports to 10 output ports which is compliant with the UC-nonblocking constraint and thus can be accommodated by a 10×10 UC nonblocking switch;

[0069] FIG. 38B depicts an I/O matching from 10 input ports to 10 output ports which is compliant with the CU-nonblocking constraint and thus can be accommodated by a 10×10 CU nonblocking switch;

[0070] FIG. 39 depicts the relationship among switch attributes that are preserved under 2X or X2 interconnection;

[0071] FIG. 40 depicts a 15×15 compressor constructed from the 2X version of a 2Stg(3,5) network by filling in the nodes with any compressors of appropriate sizes;

[0072] FIG. 41 depicts the manner in which nine conditionally nonblocking properties of a switch are preserved by two families of networks;

[0073] FIG. 42 depicts a recursive 2X construction from cells which is the 16×16 reverse banyan network appended with the inverse shuffle exchange;

[0074] FIG. 43 depicts a 16×16 divide-and-conquer network appended with the swap exchange;

[0075] FIG. 44A depicts an exemplary network wherein stage 2 is to be “scrambled”;

[0076] FIG. 44B depicts the results of scrambling stage 2 of the network of FIG. 44A;

[0077] FIG. 44C depicts the exchange immediately after stage 2 of the network of FIG. 44A resulting from cell rearrangement;

[0078] FIG. 45 depicts the four senses of equivalence among banyan-type networks arranged into a hierarchical diagram;

[0079] FIG. 46 depicts the four senses of equivalence among banyan-type networks without I/O exchanges arranged into a hierarchical diagram;

[0080] FIG. 47 depicts the four senses of equivalence among banyan-type networks extending to all bit-permuting networks;

[0081] FIG. 48 depicts the four senses of equivalence among bit-permuting networks without I/O exchanges;

[0082] FIGS. 49A-E depict all five 4-leaf binary trees;

[0083] FIGS. 50A-E depict the corresponding dimensions of each node corresponding to FIGS. 49A-E, respectively, for 2×2 building blocks;

[0084] FIG. 51 depicts the recursive plain 2-stage interconnection network associated with the balanced tree as the 16×16 network [:(3 4):(1 3)(2 4):(3 4):];

[0085] FIG. 52 depicts the recursive plain 2-stage interconnection network associated with the rightist tree as the 16×16 baseline network [:(1 2 3 4):(2 3 4):(3 4):];

[0086] FIG. 53 depicts the recursive 2X interconnection network associated with the balanced tree as the 16×16 network [:(3 4):(1 3 2 4):(3 4):(1 3 2 4)];

[0087] FIG. 54 depicts the recursive 2X interconnection network associated with the rightist tree as the 16×16 baseline network appended with the swap exchange [:(2 3 4):(2 3 4):(3 4):(1 4):(2 3)];

[0088] FIG. 55 depicts the recursive 2X interconnection network associated with the leftist tree as the 16×16 reverse banyan network appended with the inverse shuffle exchange [:(3 4):(2 4):(1 4):(1 2 3 4)];

[0089] FIG. 57 depicts a 64×64 divide-and-conquer network;

[0090] FIG. 58 depicts how the middle exchange X₍₆ ₃₎₍₅ ₂₎₍₄ ₁₎ in the 64×64 network of FIG. 57 is equivalent to the array of contact points between two perpendicular stacks of planes wherein each plane carries an 8×8 reverse baseline network;

[0091] FIG. 59 depicts a 2^(n)×2^(n) divide-and-conquer network recursively constructed as the plain 2-stage tensor product between a 2^(┌n/2┐)×2^(┌n/2┐) divide-and-conquer network and a 2^(└n/2┘)×2^(└n/2┘) divide-and-conquer network;

[0092] FIG. 60 depicts the 16×16 divide-swap-conquer network [:(3 4):(1 4)(2 3):(3 4):];

[0093] FIG. 61 depicts the 64×64 divide-swap-conquer network associated with the 6-leaf balanced binary tree of FIG. 56C as [:(5 6):(4 6):(1 6)(2 5)(3 4):(5 6):(4 6):];

[0094] FIG. 62A depicts a switch employing out-of-band control;

[0095] FIG. 62B depicts that, for an interconnection network of switching elements forming the switching fabric, each switching element is controlled by a control signal from the central control unit through a control input port;

[0096] FIG. 63A depicts the in-band control signal composed of at least one bit prefixing a packet;

[0097] FIG. 63B depicts the in-band control signal for a representative switching fabric wherein each switching element determines its own connection state according to the control signals of the local input packets;

[0098] FIG. 64A depicts a switching cell in a switching network employing out-of-band control;

[0099] FIG. 64B depicts a switching cell in a switching network when the control is by in-band signaling;

[0100] FIG. 65A depicts a high-level block diagram of a generic switching cell under in-band control;

[0101] FIG. 65B depicts the connection state ({0}, null) for a 2×1 multiplexer;

[0102] FIG. 65C depicts the connection state (null, {0}) for a 2×1 multiplexer;

[0103] FIG. 65D depicts the connection state when the two input packets at input-0 and input-1 of a bicast cell are a bicast packet and an idle packet, respectively;

[0104] FIG. 65E depicts the connection state with an idle packet at 0-input and a bicast packet at 1-input of the bicast cell;

[0105] FIG. 66A depicts a packet entering the switching network illustrating the presence of an activity bit;

[0106] FIG. 66B depicts the format of a generic routing tag of a data packet entering stage j;

[0107] FIG. 66C depicts 1×1 switching circuitry implemented as a separate device appended to the main switching cell and illustrating how the routing tag is changed at various locations in a generic stage j;

[0108] FIG. 66D depicts a packet in which the destination address d₁d₂ . . . d_(n) is preceded by the bit pattern 1d_(γ(j))p₁p₂d_(γ(j+1)) . . . d_(γ(n));

[0109] FIGS. 67A-F depict the adaptation of the block diagram of FIG. 65A for the inclusion of bit consumption and rotation as the bit consumption proceeds;

[0110] FIG. 68 depicts a partial sorting network;

[0111] FIG. 69 depicts how the application of statistical line grouping with a line-bundle size of 8 to the 16×16 divide-and-conquer network results in a 128×128 network comprising 16×16 nodes;

[0112] FIG. 70A depicts an 8-to-4 concentrator constructed by an 8×8 partial sorting network which is a 4-stage interconnection network of sorting cells;

[0113] FIG. 70B depicts a test run of 2-bit signals through another 8-to-4 concentrator which shares the same underlying 8×8 partial sorting network shown in FIG. 70A;

[0114] FIG. 71A depicts the 8-to-4 concentrator of FIG. 70A as adapted into an 8-to-4 multicast concentrator;

[0115] FIG. 71B depicts a test run with the same input packets as in FIG. 71A except for certain idle packets;

[0116] FIG. 72A depicts the operation of a multicast concentrator with priority treatment;

[0117] FIG. 72B depicts the bicasting of packets in accordance with a given priority scheme;

[0118] FIG. 73A depicts the construction by an orthogonal package;

[0119] FIG. 73B depicts the construction by an interface-board package where all input and output switching elements are printed circuit boards;

[0120] FIG. 74 depicts the construction at the interface-board package level where all input and output switching elements, represented by blocks, are orthogonal packages;

[0121] FIG. 75A depicts a binary tree associated with illustrative construction of a switching fabric from the recursive applications of 2-stage interconnection involving the five levels of physical implementation, where each internal node of the tree is mapped to one of the levels of implementation;

[0122] FIG. 75B shows the same binary tree as in FIG. 75A but with its nodes showing exemplary dimensions of the building blocks as well as the networks constructed at different steps of 2-stage interconnection in the recursion; and

[0123] FIG. 75C shows the same binary tree as in FIG. 75A but with its nodes showing exemplary generic components in the physical structure of the switching fabric.

DETAILED DESCRIPTION

[0124] To fully appreciate the import of the switching circuitry of the present invention, as well as to gain an appreciation for the underlying operational principles of the present invention, it is instructive to first discuss in overview fashion foundational principles pertinent to the present invention. This overview also serves to introduce terminology so as to facilitate the more detailed description of illustrative embodiments in accordance with the present invention.

[0125] A. SWITCH AND NETWORK

[0126] 1. Switch and its properties

[0127] Definition A1: “connection state”. Let Inputs denote an array (that is, an ordered set) of m elements and Outputs an array of n elements. A “connection state” from the m-element Inputs array to the n-element Outputs array is a sequence (T₀, T₁, T₂, . . . , T_(m−1)) of m pairwise disjoint subsets of the Outputs array. Elements in the array Inputs and the array Outputs are respectively called “inputs” and “outputs” in the connection state. When k∈T_(j), the input j is said to be connected to output k in the connection state.

[0128] The connection state (T₀, T₁, T₂, . . . , T_(m−1)) means the configuration where each input j is connected to all outputs in T_(j); the set T_(j) may be null. The disjointness among T₀, T₁, T₂, . . . , T_(m−1) prevents collision of different inputs at an output. Because each of the n outputs is either unconnected or connected to exactly one of the m inputs, the total number of connection states from an array of m elements to an array of n elements is (m+1)^(n).
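The count (m+1)^(n) can be checked by enumeration. The following is a minimal sketch, assuming a connection state is represented as a tuple of frozensets (one subset T_(j) per input); the function name `connection_states` is hypothetical.

```python
from itertools import product


def connection_states(m, n):
    """Yield every connection state (T_0, ..., T_(m-1)) from m inputs to n outputs.

    Each output is assigned an "owner": None (unconnected) or one input index.
    Giving every output at most one owner makes the subsets T_j pairwise
    disjoint by construction.
    """
    for owners in product([None, *range(m)], repeat=n):
        yield tuple(
            frozenset(k for k, owner in enumerate(owners) if owner == j)
            for j in range(m)
        )


states = list(connection_states(2, 3))
print(len(states))  # 27, i.e. (2 + 1) ** 3
```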

EXAMPLE 1

[0129] Consider the case of m=2 and n=3. There are a total of 27 connection states. Further, for the sake of concreteness but without loss of generality, consider that the Inputs array represents the inputs to a circuit element and the Outputs array represents the outputs from the circuit element. The two inputs to the circuit element are 0 and 1, that is, Inputs={0, 1}; the three outputs from the circuit are 0, 1, and 2, or Outputs={0, 1, 2}. Referring now to FIGS. 1A-1H, eight of the possible 27 connection states for the circuit element are depicted both for illustrative purposes and for eventual use to exemplify later definitions. In particular, for FIG. 1A, the connection state engendered by connecting input 0 to output 0 and input 1 to output 1 (shown by the dashed lines internal to circuit element 100) is as follows: ({0}, {1}), that is, T₀={0} and T₁={1}. This connection state is referred to as C₀. This connection state as well as the remaining seven connection states of FIGS. 1B-1H are tabulated as follows:

[0130] C₀=({0}, {1}),

[0131] C₁=({0}, {2}),

[0132] C₂=({1}, {0}),

[0133] C₃=({1}, {2}),

[0134] C₄=({2}, {0}),

[0135] C₅=({2}, {1}),

[0136] C₆=({0, 1, 2}, null), and

[0137] C₇=(null, {0, 1, 2}).

[0138] Definition A2: “point-to-point connection state” and “multicast connection state”. A connection state (T₀, T₁, T₂, . . . , T_(m−1)) from the array Inputs to the array Outputs is said to be a “point-to-point connection state” if every set T_(j) contains at most one element; otherwise, the connection state is called a “multicast connection state”.

EXAMPLE 2

[0139] Using the connection states of Example 1, connection states C₀, C₁, . . . , C₅ are point-to-point since every set T_(j) contains at most one element, whereas connection states C₆ and C₇ are multicast.

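[0140] For the case of m=2 and n=3, there are a total of twelve point-to-point connection states in which at least one connection is made. (The all-null state (null, null) also satisfies Definition A2 and, if counted, is a thirteenth.)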

EXAMPLE 3

[0141] Besides the six connection states C₀, . . . , C₅, the remaining six point-to-point connection states for element 100 in FIG. 1A having 2 inputs and 3 outputs are as follows:

[0142] C₈=({0}, null),

[0143] C₉=({1}, null),

[0144] C₁₀=({2}, null),

[0145] C₁₁=(null, {0}),

[0146] C₁₂=(null, {1}), and

[0147] C₁₃=(null, {2}).
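Definition A2 can be applied mechanically to all 27 connection states of the 2×3 element. The sketch below assumes the same tuple-of-frozensets representation as before; `is_point_to_point` and the other names are hypothetical. It separates the states listed in Examples 1 and 3 from the all-null state, which makes no connection at all.

```python
from itertools import product


def connection_states(m, n):
    """Enumerate all connection states from m inputs to n outputs."""
    for owners in product([None, *range(m)], repeat=n):
        yield tuple(
            frozenset(k for k, owner in enumerate(owners) if owner == j)
            for j in range(m)
        )


def is_point_to_point(state):
    """Definition A2: every subset T_j contains at most one element."""
    return all(len(t) <= 1 for t in state)


states = list(connection_states(2, 3))
p2p = [s for s in states if is_point_to_point(s)]
# Point-to-point states that make at least one connection (C0-C5, C8-C13).
nontrivial_p2p = [s for s in p2p if any(s)]
multicast = [s for s in states if not is_point_to_point(s)]

print(len(nontrivial_p2p), len(p2p), len(multicast))  # 12 13 14
```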

[0148] Definition A3: “switch”. A collection of at least two different connection states from the input array to the output array is called a “switch” if it has the routing property of a switch. The routing property states that for every element j in the array Inputs and every element k in the array Outputs, there is a connection state (T₀, T₁, T₂, . . . , T_(m−1)) such that k is in the subset T_(j).

[0149] Elements of Inputs and Outputs are respectively called the “input ports” and “output ports” of the switch, or simply “inputs” and “outputs” of the switch when there is no ambiguity. The switch is called an “m×n” switch when there are m inputs and n outputs.

[0150] It takes at least two different connection states to qualify as a switch because a single connection state can be realized by fixed or hard wiring. The routing property of a switch ensures the connectivity from every input to every output.

[0151] The abstract notion of a switch actually refers to a “switching fabric or device in unidirectional transmission” and is independent of the notion of switching control, which will be discussed in the sequel. Moreover, the connection states in the definition map into connection configurations realizable by the switching fabric. Thus, whereas the notion of connection states may be abstract, the connection states are physically manifested by actual connection configurations of the switching fabric.

EXAMPLE 4

[0152] Using the connection states of Examples 1 and 3, it is possible to configure a number of different switches.

[0153] (a) For example, consider the collection of connection states, denoted C_(A), where C_(A)=(C₁, C₂, C₅, C₁₂), and place the connection states of C_(A) in the tabular form:

Connection State    T₀      T₁
C₁                  {0}     {2}
C₂                  {1}     {0}
C₅                  {2}     {1}
C₁₂                 null    {1}

[0154] It is clear that each output is present in the column under T₀, and similarly each output is present in column T₁, so the collection of connection states in C_(A) defines a switch.

[0155] (b) Consider now the collection of states C_(B)=(C₀, C₃, C₄), as follows:

Connection State    T₀      T₁
C₀                  {0}     {1}
C₃                  {1}     {2}
C₄                  {2}     {0}

[0156] Once again each output is present in both columns, so C_(B) is another switch.

[0157] (c) Consider now the collection of states C_(C)=(C₀, C₃, C₅), as follows:

Connection State    T₀      T₁
C₀                  {0}     {1}
C₃                  {1}     {2}
C₅                  {2}     {1}

[0158] Now, whereas column T₀ has all outputs represented, column T₁ does not (output 0 is missing), so C_(C) is not a switch.

[0159] (d) Consider now the collection of states C_(D)=(C₆, C₇), as follows:

Connection State    T₀         T₁
C₆                  {0,1,2}    null
C₇                  null       {0,1,2}

[0160] Once again each output is present in both columns, so C_(D) is yet another switch.
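The column check used in Example 4 is the routing property of Definition A3, and it can be sketched as a short test. The representation (tuples of frozensets) and the names `state` and `is_switch` are assumptions made for illustration only.

```python
def state(*subsets):
    """Build a connection state; an empty tuple () stands for null."""
    return tuple(frozenset(s) for s in subsets)


def is_switch(states, m, n):
    """Definition A3: at least two distinct states, and every (input j,
    output k) pair is connected in at least one of them."""
    distinct = set(states)
    if len(distinct) < 2:
        return False
    return all(
        any(k in s[j] for s in distinct)
        for j in range(m)
        for k in range(n)
    )


C_A = [state({0}, {2}), state({1}, {0}), state({2}, {1}), state((), {1})]
C_B = [state({0}, {1}), state({1}, {2}), state({2}, {0})]
C_C = [state({0}, {1}), state({1}, {2}), state({2}, {1})]
C_D = [state({0, 1, 2}, ()), state((), {0, 1, 2})]

for name, coll in [("C_A", C_A), ("C_B", C_B), ("C_C", C_C), ("C_D", C_D)]:
    print(name, is_switch(coll, 2, 3))
```

Only C_(C) fails, because no state in it connects input 1 to output 0.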

[0161] Definition A4: “point-to-point switch” and “multicast switch”. Aswitch is a “point-to-point switch” if every connection state composingthe switch is a point-to-point connection state; otherwise, the switchis a “multicast switch”.

EXAMPLE 5

[0162] Switches defined by collections C_(A) and C_(B) of Example 4 arepoint-to-point, whereas C_(D) defines a multicast switch.

[0163] Definition A5: “switching cell”. A “switching cell” is a 2-statepoint-to-point switch, with the connection states, as shown in FIGS. 2Aand 2B, being called the bar state (201) and cross state (202),respectively. In particular, the bar connection state is ({0}, {1}), andthe cross connection state is ({1},{0}).

[0164] Definition A6: “expander cell”. An “expander cell” is a multicast switch with the four connection states (211, 212, 213, 214) as shown in FIGS. 2C-2F, respectively, which include the bar state (211) and cross state (212) of the switching cell. In particular, the connection states are: ({0},{1}); ({1},{0}); ({0,1}, null); and (null, {0,1}). In tabular form, the connection states are:

Connection State
T₀        T₁
{0}       {1}
{1}       {0}
{0,1}     null
null      {0,1}

[0165] Notice that the expander cell conforms to the definition of switch because each output is present in T₀ and in T₁. Of the four connection states, only the bar and cross states are point-to-point. Therefore the expander cell is a multicast switch.
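As a hedged illustration, the classification of Definitions A4 through A6 may be sketched in Python. The sketch assumes that a point-to-point connection state is one in which every input reaches at most one output and no output is reached from two inputs; that reading is taken from the examples rather than quoted from a definition in this passage, and the names used are hypothetical.

```python
def is_point_to_point_state(state):
    """Assumed reading: each T_i is null or a single output, with no output shared."""
    seen = set()
    for t in state:
        if len(t) > 1 or (t & seen):
            return False
        seen |= t
    return True


def classify(states):
    """Apply Definition A4 to a collection of connection states."""
    if all(is_point_to_point_state(s) for s in states):
        return "point-to-point"
    return "multicast"


BAR = ({0}, {1})
CROSS = ({1}, {0})

switching_cell = [BAR, CROSS]
expander_cell = [BAR, CROSS, ({0, 1}, set()), (set(), {0, 1})]

print(classify(switching_cell))
print(classify(expander_cell))
```

With these inputs the switching cell is classified as point-to-point and the expander cell as multicast, matching paragraph [0165].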

[0166] Switching cells and expander cells are extensively used in the recursive construction of networks, as discussed later.

[0167] Definition A7: “accommodation of a combination of concurrent I/O connections by a switch”. A connection state (T₀, T₁, T₂, . . . , T_(m−1)) of an m×n switch is said to “achieve” the I/O connection from input i to output k if k∈T_(i). Consider the combination of concurrent I/O connections from inputs I₁, I₂, I₃, . . . to distinct outputs O₁, O₂, O₃, . . . , respectively. A switch is said to “accommodate” this combination of concurrent I/O connections if there exists a connection state of the switch that achieves every I/O connection in the combination, i.e., the connection from input I_(j) to output O_(j) for every index j.

EXAMPLE 6

[0168] The combination of concurrent I/O connections for a 3×3 switch can be input 0 connected to output 2 and input 1 connected to output 0. If the switch has any connection state that achieves both connections concurrently, then the switch is said to “accommodate” this combination. One qualified connection state can be ({2},{0}, null); another qualified connection state is ({1,2},{0}, null).
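The test of Definition A7 can be sketched as below. This is a minimal sketch under stated assumptions: the two-state collection named example_switch is hypothetical and contains only the two qualified connection states mentioned in Example 6, and a combination is written as a list of (input, output) pairs.

```python
def achieves(state, i, k):
    """A connection state achieves input i -> output k if k is in T_i."""
    return k in state[i]


def accommodates(states, combination):
    """True if some single connection state achieves every pair in the combination."""
    return any(
        all(achieves(state, i, k) for i, k in combination)
        for state in states
    )


# Hypothetical 3x3 switch holding only the two states named in Example 6.
example_switch = [
    ({2}, {0}, set()),
    ({1, 2}, {0}, set()),
]

request = [(0, 2), (1, 0)]  # input 0 -> output 2, input 1 -> output 0
print(accommodates(example_switch, request))
print(accommodates(example_switch, [(2, 1)]))  # input 2 is null in both states
```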

[0169] Note that a connection state is an intrinsic characteristic of a switch, which is a legitimate connection configuration of the switch, while a combination of I/O connections in the above definition can be regarded as an arbitrary request made on a switch, which can be from any particular set of inputs to any set of distinct outputs. Being a request, a combination of I/O connections may not always be accommodated by the switch. For example, the connection from an input to more than one output, that is, a multicast connection request, can never be accommodated by a point-to-point switch.

[0170] On the other hand, when a combination of concurrent connections is accommodated by a switch, the I/O connections in the qualified connection state cover, but are not limited to, the combination that is being accommodated.

[0171] Definition A8: “nonblocking property of a switch”. An m×n switch is said to be “nonblocking” if, for every sequence of distinct inputs I₀, I₁, . . . , I_(k−1) and every sequence of distinct outputs O₀, O₁, . . . , O_(k−1), where k=min {m,n}, there exists a connection state that concurrently connects each I_(j) to O_(j) for all j, 0≦j≦k−1.

[0172] In effect, a nonblocking switch can accommodate every combination of point-to-point connections between inputs and outputs as one would intuitively expect. This definition is an extension of the routing property. Notice, too, that this definition does not preclude multicast connection states from the switch, despite the apparent point-to-point nature of the definition.

[0173] In the above definition A8, the sequence of distinct inputs I₀, I₁, . . . , I_(k−1) may be restricted to be in increasing order without loss of generality. In the following example we shall impose this restriction so as to avoid unnecessary duplications in I/O pairings.

EXAMPLE 7

[0174] Again, consider the example of circuit element 100 having 2 inputs and 3 outputs. It is known that there are twelve possible point-to-point connection states, namely, C₀, . . . , C₅, and C₈, . . . , C₁₃ in the notation of previous examples. Using the parameters of the definition for the nonblocking property of a switch, min{m, n}=2, so k=2. For k=2, there is only one sequence of two distinct inputs arranged in increasing order, that is, (I₀, I₁)=(0, 1). On the other hand, there are six sequences of two distinct outputs out of a total of three outputs, namely, (0, 1), (0, 2), (1, 0), (1, 2), (2, 0), (2, 1).

[0175] Consider the following tabular form:

Input Sequence (I₀, I₁)    Output Sequence (O₀, O₁)    Connection State
(0, 1)                     (0, 1)                      C₀
(0, 1)                     (0, 2)                      C₁
(0, 1)                     (1, 0)                      C₂
(0, 1)                     (1, 2)                      C₃
(0, 1)                     (2, 0)                      C₄
(0, 1)                     (2, 1)                      C₅

[0176] It is clear from this tabular information that, for every sequence I₀, I₁ of distinct inputs and every sequence O₀, O₁ of distinct outputs, there exists a connection state that concurrently connects each I_(j) to O_(j) for all j. The connection states for this illustrative example used the six point-to-point connection states C₀, . . . , C₅.
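The exhaustive check of Example 7 may be sketched in Python as follows. The sketch assumes the tuple-of-sets representation of a connection state and, following paragraph [0173], enumerates input sequences in increasing order only; the function name is_nonblocking is hypothetical.

```python
from itertools import combinations, permutations


def is_nonblocking(states, m, n):
    """Check Definition A8 by enumeration for an m x n switch."""
    k = min(m, n)
    for inputs in combinations(range(m), k):        # increasing input order
        for outputs in permutations(range(n), k):   # every ordered choice of outputs
            pairs = list(zip(inputs, outputs))
            if not any(all(o in s[i] for i, o in pairs) for s in states):
                return False
    return True


# The six states C0..C5 of Example 7, written as (T0, T1).
C = [
    ({0}, {1}),  # C0
    ({0}, {2}),  # C1
    ({1}, {0}),  # C2
    ({1}, {2}),  # C3
    ({2}, {0}),  # C4
    ({2}, {1}),  # C5
]

print(is_nonblocking(C, m=2, n=3))
print(is_nonblocking(C[:5], m=2, n=3))  # dropping C5 leaves (2, 1) unserved
```

Enumeration of this kind grows factorially with the switch size and is offered only to make the definition concrete for small cases.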

[0177] A major objective of switching theory is to construct sizable switching fabrics that route data signals from inputs to outputs concurrently. If the bit rate at every input is λ, then ideally no single device in an n-input switching fabric needs to operate at a speed proportional to nλ. In that way the total throughput is not bounded by the economical feasibility of any single device. The nonblocking property of a switch is hence a key issue in point-to-point communications. Ideally no single component of the switching control, including the processor, operates at a speed proportional to nλ either. Even a nonblocking switch only promises the existence of a connection state that accommodates a given combination of point-to-point connections. The switching control identifies and activates the appropriate connection state. This requires proper control signaling to all switching elements on the connection path of every data signal. The switching control also prevents the collision of data signals from multiple inputs at any point in the switch; switching control will be discussed in detail in the sequel.

[0178] Discussed in more detail later, but worth highlighting at this point, is the notion of a “conditionally nonblocking switch”: a conditionally nonblocking switch of any kind may serve as a nonblocking switch when the input traffic has been preprocessed so as to meet the specified condition. A “compressor”, a “decompressor”, an “expander”, a “UC nonblocking switch”, etc., to be defined in the sequel, are conditionally nonblocking switches in a form that enables such elements to accommodate every combination of concurrent I/O connections subject to a certain correlation among I/O addresses inside the combination.

[0179] 2. Multi-stage interconnection network and its properties

[0180] A “switching network” composed of nodes involves two independent concepts. One is the switching at individual nodes; the other is the interconnection of the nodes. In line with these concepts, it is helpful to first discuss an “interconnection network” in which every node is a simple box with an array of input terminals (or “input ports” or simply “inputs” when there is no ambiguity) and an array of output terminals (or “output ports” or simply “outputs”) without any concern for connection states of the box. Then a switching network is formulated as an interconnection network whereby every node is filled by an appropriate switch. In this way, the interconnection of smaller switches creates a larger switch, whose characteristics depend on both the type of interconnection of nodes and the attributes of the individual switches composing the nodes. Thus, there must be a clear conceptual separation between the attributes of a switch and the type of networking.

[0181] Definition A9: “interconnection network”. An “interconnection network” is a finite collection of nodes together with a collection of unidirectional interconnection lines such that:

[0182] (a) every node is an object with an array of inputs and an array of outputs;

[0183] (b) an interconnection line leads from an output of one node to the input of another node; and

[0184] (c) every input/output (I/O) of a node is incident with at most one interconnection line.

[0185] A node with m inputs and n outputs is called an m×n node or a node with “size” m×n. In particular, a 2×2 node is called a cell.

[0186] Since a node in an interconnection network is characterized by an input array and an output array, a node can qualify to be a switch through the proper specification of connection states between its I/O arrays.

[0187] Definition A10: “external I/O”, “input node”, and “output node”. An I/O of a node in an interconnection network is called an “external I/O” if it is not incident with any interconnection line. A node containing an external input of the interconnection network is called an “input node”; similarly, a node containing an external output of the interconnection network is called an “output node”. An interconnection network with M external inputs and N external outputs is called an M×N interconnection network or a network with a “size” of M×N.

EXAMPLE 1

[0188] FIG. 3A depicts a 3×3 interconnection network 300 with three nodes designated S, T, and U. Nodes S and U are input nodes while nodes T and U are output nodes.

[0189] Definition A11: “route”. A “route” from an external input A of an interconnection network to an external output B means a chain (a₀, b₀, a₁, b₁, . . . , a_(k), b_(k)), k≧0, with the following characteristics:

[0190] (a) for 0≦j≦k, there is a node Z_(j) on which a_(j) is an input and b_(j) is an output;

[0191] (b) a₀, a₁, . . . , a_(k) are distinct from one another;

[0192] (c) b₀, b₁, . . . , b_(k) are distinct from one another;

[0193] (d) for 0<j≦k, b_(j−1) is interconnected to a_(j); and

[0194] (e) A=a₀ and B=b_(k).

[0195] It should be noted that this definition allows for the traversing of nodes more than once.

EXAMPLE 2

[0196] Interconnection network 400 in FIG. 4 depicts an example for k=2 of route 401 from A=a₀ to B=b₂, which are the only input and output, respectively, for network 400.

[0197] Definition A12: “routable”. An interconnection network is “routable” if there is a route from every external input to every external output. For instance, if there are two external inputs A₀ and A₁ and three external outputs B₀, B₁ and B₂, then the network is routable if there are routes A₀→B₀, A₀→B₁, A₀→B₂, A₁→B₀, A₁→B₁, and A₁→B₂, where A→B is read as “there is a route from A to B”.
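Definitions A10 through A12 lend themselves to a small search procedure, sketched below under stated assumptions. The data layout is hypothetical: nodes maps a node name to its (inputs, outputs) size, and lines maps an output port (node, index) to the input port it is wired to. The three-node sample wiring is likewise an assumption chosen for illustration and is not asserted to be that of any figure.

```python
def external_ports(nodes, lines):
    """External I/O are ports not incident with any interconnection line."""
    wired_out, wired_in = set(lines.keys()), set(lines.values())
    ext_in = [(z, p) for z, (m, n) in nodes.items()
              for p in range(m) if (z, p) not in wired_in]
    ext_out = [(z, p) for z, (m, n) in nodes.items()
               for p in range(n) if (z, p) not in wired_out]
    return ext_in, ext_out


def has_route(nodes, lines, a, target, seen_in=frozenset(), seen_out=frozenset()):
    """Depth-first search for a chain (a0, b0, ..., ak, bk) ending at target."""
    seen_in = seen_in | {a}
    node = a[0]
    for p in range(nodes[node][1]):
        b = (node, p)
        if b in seen_out:
            continue                      # outputs on a route must be distinct
        if b == target:
            return True
        nxt = lines.get(b)
        if nxt is not None and nxt not in seen_in:   # inputs must be distinct
            if has_route(nodes, lines, nxt, target, seen_in, seen_out | {b}):
                return True
    return False


def is_routable(nodes, lines):
    ext_in, ext_out = external_ports(nodes, lines)
    return all(has_route(nodes, lines, a, b) for a in ext_in for b in ext_out)


# Hypothetical 3x3 network of three 2x2 cells.
nodes = {"S": (2, 2), "T": (2, 2), "U": (2, 2)}
lines = {("S", 0): ("T", 0), ("S", 1): ("U", 0), ("U", 0): ("T", 1)}
print(is_routable(nodes, lines))

# Removing the U-to-T line leaves T's second input unable to reach U's outputs.
fewer_lines = {("S", 0): ("T", 0), ("S", 1): ("U", 0)}
print(is_routable(nodes, fewer_lines))
```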

EXAMPLE 3

[0198] Consider the 3×5 interconnection network 500 of FIG. 5A. It is a routable interconnection network, as is easily discerned by following interconnection lines from each external input to each external output.

[0199] Definition A13: “unique-routing network” and “alternate-routing network”. Recall the definition of a route from an external input of an interconnection network to an external output from Definition A11. Two routes (a₀, b₀, a₁, b₁, . . . , a_(k), b_(k)) and (a₀, b₀′, a₁′, b₁′, . . . , a_(k)′, b_(k)) in a network are said to be “parallel” if a_(j) and a_(j)′ reside on the same node for 0<j≦k and both b_(j) and b_(j)′ reside on the same node for 0≦j<k.

[0200] A routable interconnection network is said to be “unique routing” if all routes from any given external input to any given external output are parallel. Otherwise, it is said to be “alternate routing”.

[0201] Note that it is possible for two nonparallel routes to go through a common interconnection line. In the definition of a unique-routing network, parallel routes are indistinguishable. This is only practical in terms of routing control. Thus even a unique-routing network allows a bit of parallelism. The parallelism in a unique-routing network can be seen in, for example, the application of the technique of statistical line grouping to a network, which will be described in the sequel.

EXAMPLE 4

[0202] The interconnection network 300 in FIG. 3A is an alternate-routing network because, besides the direct access from the node S to the node T, there is indirect access through the node U. An example of a unique-routing network is the network 500 as shown in FIG. 5A. There are no parallel routes in this network. The numerous banyan-type networks and all networks constructed from the recursive 2-stage construction, including its generalized version, as will be described in the sequel, are all of the unique-routing type.

[0203] Definition A14: “external input order”, “external output order”, and “external I/O order”. An “external input order” of an interconnection network means an ordering on the external inputs of the interconnection network; similarly, an “external output order” of an interconnection network means an ordering on the external outputs of the interconnection network. An “external I/O order” means a combination of an external input order and an external output order.

[0204] 3. Switching network

[0205] Definition A15: “switching network”. An interconnection network is called a “switching network” if

[0206] (a) every node qualifies as a switch through proper specification of connection states;

[0207] (b) the network is routable; and

[0208] (c) an external I/O order of the network is specified.

EXAMPLE 1

[0209] Consider again 3×5 interconnection network 500 of FIG. 5A, now recast as network 510 in FIG. 5B. Suppose that every node in network 510 attains the status of a switch upon the proper specification of connection states. For instance, configure nodes 502, 503, and 504 as switching cells (SC), and nodes 501 and 505 as distributors (DR). (A distributor is a 1×2 switch defined by the two connection states ({0}) and ({1}).) With the specification of an external I/O order (e.g., the natural order (0, 1, 2, . . . ) in the top-down manner for the external inputs and outputs), network 510 qualifies as a switching network.

[0210] Definition A16: “connection state from external inputs to external outputs”. Consider a switching network with the array ExtInputs (respectively or resp. ExtOutputs) of external inputs (resp. external outputs). Given a connection state on every node, there corresponds a “connection state from the array of ExtInputs to the array of ExtOutputs” as follows: an external input a₀ is connected to an external output b_(k) in the connection state from the array ExtInputs to the array ExtOutputs if there exists a route (a₀, b₀, a₁, b₁, . . . , a_(k), b_(k)) in the network such that, for 0≦j≦k, a_(j) is connected to b_(j) by the given connection state in the node that a_(j) and b_(j) reside on.

[0211] Accordingly, every combination of a connection state on every node in a switching network corresponds to a connection state between the array of external inputs and the array of external outputs; however, this correspondence is not necessarily one-to-one.

EXAMPLE 2

[0212] Suppose each of the nodes S, T, and U in the interconnection network of FIG. 3A is filled with a switching cell. Also, label the external inputs/outputs as 0, 1, and 2 from top down. Such an arrangement is shown as network 310 in FIG. 3B. A total of eight combinations can be formed by a bar/cross state on each of the three nodes. These eight combinations correspond to six distinct connection states between arrays of external I/O, as tabulated below (including two duplicate pairs indicated by asterisks):

State of S    State of T    State of U    Corresponding Connection State between External I/O
Bar           Bar           Bar           ({0},{1},{2})*
Bar           Bar           Cross         ({0},{2},{1})
Bar           Cross         Bar           ({1},{0},{2})**
Bar           Cross         Cross         ({1},{2},{0})
Cross         Bar           Bar           ({1},{0},{2})**
Cross         Bar           Cross         ({2},{0},{1})
Cross         Cross         Bar           ({0},{1},{2})*
Cross         Cross         Cross         ({2},{1},{0})
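The enumeration in this example can be reproduced by a short Python sketch. Because FIG. 3B is not reproduced in this passage, the wiring used below is an assumption: it was chosen so that the computed results agree with the tabulated connection states, and it should not be read as the wiring of the figure itself. A result tuple (o₀, o₁, o₂) stands for the connection state ({o₀},{o₁},{o₂}).

```python
from itertools import product

BAR = {0: 0, 1: 1}      # input port -> output port within a cell
CROSS = {0: 1, 1: 0}
CELL_STATES = {"Bar": BAR, "Cross": CROSS}

# Assumed wiring: S feeds T and U, U feeds T; all remaining ports are external.
LINES = {("S", 0): ("T", 0), ("S", 1): ("U", 0), ("U", 0): ("T", 1)}
EXT_IN = {0: ("S", 0), 1: ("S", 1), 2: ("U", 1)}
EXT_OUT = {("T", 0): 0, ("T", 1): 1, ("U", 1): 2}


def external_state(cells):
    """Trace each external input through the cells to an external output."""
    result = []
    for ext in range(3):
        node, port = EXT_IN[ext]
        while True:
            out = (node, cells[node][port])
            if out in EXT_OUT:
                result.append(EXT_OUT[out])
                break
            node, port = LINES[out]
    return tuple(result)


table = {
    (s, t, u): external_state(
        {"S": CELL_STATES[s], "T": CELL_STATES[t], "U": CELL_STATES[u]}
    )
    for s, t, u in product(CELL_STATES, repeat=3)
}
distinct = set(table.values())

for combo, state in table.items():
    print(combo, state)
print(len(table), "combinations,", len(distinct), "distinct states")
```

The sketch illustrates why the correspondence of paragraph [0211] need not be one-to-one: two different settings of the cells can trace out the same end-to-end connections.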

[0213] Theorem: “switch”. As stated following the above Definition A16, every combination of a connection state on every node in a switching network corresponds to a connection state between the array of external inputs and the array of external outputs. The collection of all connection states from the array of external inputs of a switching network to the array of external outputs involved in such correspondence constitutes a switch between arrays of external I/O, that is, the collection satisfies the routing property of a switch.

[0214] Definition A17: “switch realization of a switching network”. The switch between arrays of external I/O, described in the preceding Theorem, is called the “switch realization of the switching network” or the “switch constructed from the switching network”.

[0215] The switch constructed from a switching network can be deployed as a node in another network; such recursive construction yields indefinitely large switches.

[0216] 4. Switch properties vs. network properties

[0217] It is important to differentiate the properties of a switch from those of a network. A switch has various attributes like “point-to-point switch”, “multicast switch”, and “nonblocking switch”. These attributes are referred to as switch properties as their definition depends only on the connection states of a switch.

[0218] On the other hand, some concepts are related to a network only. The following items (a)-(f) are related to the inventive subject matter; they will be discussed in detail in the sequel.

[0219] (a) multi-stage network;

[0220] (b) exchanges in multi-stage network;

[0221] (c) plain 2-stage, 2X and X2 interconnection and recursive plain 2-stage, 2X and X2 construction;

[0222] (d) bit-permuting exchange, bit-permuting network and banyan-type network;

[0223] (e) trace and guide of a bit-permuting network; and

[0224] (f) equivalence among banyan-type networks under cell rearrangement.

[0225] Since a switching network is a routable interconnection network in which every node is filled by a switch, the nature of a switch constructed from a switching network is determined by the attributes of both the interconnection network and the individual switching nodes.

[0226] Definition A18: “Preservation of a switch property by a network”. Certain types of interconnection of the network nodes may preserve certain switch properties. A switch property is said to be “preserved” by a routable interconnection network if, when each node of the interconnection network is filled by a switch having this certain switch property, the overall realized switch also has this same switch property. Recursive application of this type of interconnection then leads to indefinitely large switches with the same property. Therefore, when a large switch with some desirable properties is to be built, if there exist certain types of interconnection which can preserve the said switch properties, then, instead of constructing it in one step, which is usually impractical, it can be constructed in recursive steps wherein each step is the proper interconnection of smaller switches having the same desirable properties such that these properties are always preserved in the recursion.

[0227] 5. Multi-stage interconnection network

[0228] Definition A19: “multi-stage interconnection network”. A “multi-stage interconnection network” (abbreviated “multi-stage network”) is an interconnection network whose nodes are grouped into “stages” such that

[0229] (a) every interconnection line is between two consecutive stages;

[0230] (b) every external input is on a first-stage node;

[0231] (c) every external output is on a final-stage node; and

[0232] (d) nodes within each stage are linearly ordered, starting from 0, as node 0, 1, 2, . . .

[0233] When the number of stages is k, the multi-stage network is called a “k-stage network”. A node in the j^(th) stage is called a “stage-j node”. An I/O of a stage-j node is called a “stage-j I/O”.

[0234] The graph representation of a multi-stage network is as follows, with the help of FIG. 6A and FIG. 6B. FIG. 6A shows a generic M×N k-stage network 600 while FIG. 6B shows a 5×4 2-stage network 610 as an example. As shown in FIG. 6A, the stages of a k-stage network 600 are arranged sequentially in a left-to-right manner by convention and linearly labeled as stage 1, 2, . . . , j, . . . , and k. All nodes in each stage are arranged sequentially in a top-to-bottom manner by convention and linearly labeled as node 0, 1, 2, . . . For example, if R_(j) is the number of nodes in stage j, then the nodes in stage j are linearly labeled as node 0, 1, 2, . . . , R_(j)−1. According to the “left-in-right-out” convention, all ports on the left-hand-side of a node are the input ports of that node, and all ports on the right-hand-side of a node are the output ports of that node.

[0235] Definition A20: “induced I/O order at each stage”. The I/O ports on each node (e.g., 602) are also arranged sequentially in a top-to-bottom manner by convention and linearly labeled as I/O port 0, 1, 2, . . . , of that node. In the scope of a stage, all stage-j I/O ports are sequentially arranged by concatenating the I/O ports of all stage-j nodes according to the linear order of the node within the stage so as to form a single array and linearly labeled from top to bottom as I/O port 0, 1, 2, . . . , of stage j. In other words, the linear order among stage-j nodes induces a linear order among stage-j I/O by concatenating the I/O arrays of all stage-j nodes into a single array. This is called the “induced order” on stage-j I/O. The label of an I/O port in a stage is also called the “address of the I/O port” in that stage.

[0236] For example, as shown in FIG. 6B, the two inputs (611, 612) on stage-1 node 0 (621) are locally labeled as input 0 and 1 (631), and the three inputs (613, 614, 615) on stage-1 node 1 (622) are locally labeled as input 0, 1 and 2 (632). Then the induced order on these five stage-1 inputs is 0, 1, 2, 3 and 4 (633) as in the scope of the stage. Similarly, the induced orders on the five stage-1 outputs, the five stage-2 inputs and the four stage-2 outputs are 0, 1, 2, 3, 4 (634), 0, 1, 2, 3, 4 (635) and 0, 1, 2, 3 (636), respectively. Note that in graph representation, the labels for the local I/O orders and the induced I/O orders are usually not shown unless they need to be explicitly referred to. The external inputs of a multi-stage network are the same as stage-1 inputs, and external outputs are the same as final-stage outputs.
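The concatenation that produces the induced order can be sketched as follows. The function name induced_addresses is hypothetical, and the input is simply the list of port counts of the nodes of one stage, taken in node order.

```python
def induced_addresses(port_counts):
    """Map (node index, local port index) to the induced stage-wide address."""
    addresses = {}
    next_address = 0
    for node, count in enumerate(port_counts):
        for port in range(count):
            addresses[(node, port)] = next_address
            next_address += 1
    return addresses


# Stage-1 inputs of the 5x4 network of FIG. 6B: node 0 has 2 inputs, node 1 has 3.
stage1_inputs = induced_addresses([2, 3])
for (node, port), address in stage1_inputs.items():
    print(f"node {node}, local input {port} -> stage-1 input {address}")
```

For the two stage-1 nodes described above, the local labels 0, 1 and 0, 1, 2 are mapped to the stage-wide addresses 0 through 4.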

[0237] Definition A21: “default external I/O order”. The induced order of stage-1 inputs and of final-stage outputs of a multi-stage network is called the “default system” of an external I/O order, or simply the “default external I/O order”. In other words, in a conventional graph representation, the default external input order and the default external output order of an M×N multi-stage network are 0, 1, . . . , M−1 and 0, 1, . . . , N−1, respectively, in the top-down manner. For example, as shown in FIG. 6B, the default external input order 0, 1, 2, 3, 4 (637) is the same as the induced order of stage-1 inputs (633) and, similarly, the default external output order 0, 1, 2, 3 (638) is the same as the induced order of stage-2 (final-stage) outputs (636).

[0238] When an external I/O order on a multi-stage network is prescribed, it may or may not coincide with the default system. In the graph representation, one way to indicate a prescribed external I/O order is by numerical addresses starting from 0 on both sides of the multi-stage network. This is illustrated by the drawing 660 in FIG. 6C. The numerical labeling, however, does not work well in the graph representation when the multi-stage network is to be linked to other networks. The preferred representation of external I/O order is to split the double identities between an external input and a stage-1 input and also between an external output and a final-stage output; the split identities are then indicated by two separate points interconnected with a straight line. In the conventional graph representation, the stage-1 inputs remain attached to stage-1 nodes. Meanwhile, points representing external inputs are lined up vertically and placed to the left of the stage-1 nodes. Symmetric arrangement applies to the output side as well. This graph representation of the prescribed external I/O order is illustrated as the network 670 in FIG. 6D as depicted by reference numerals 681 and 683. Reference numeral 682 shows the interconnection between stage 1 and stage 2.

[0239] 6. Exchanges in the multi-stage network

[0240] A k-stage network is said to be interconnected in the sense that each stage-j output port is connected to a distinct stage-(j+1) input port, for 1≦j<k, by one and only one interconnection line in a one-to-one manner. This implies that, for any k-stage network, the number of stage-j output ports, for 1≦j<k, must be the same as that of stage-(j+1) input ports.

[0241] Definition A22: “interstage exchange”, “input exchange”, and “output exchange”. The pattern defined by the interconnection lines between two consecutive stages of a multi-stage network is called the “interstage exchange” which defines a one-to-one correspondence from outputs of the front stage to inputs of the hind stage. For example, in FIG. 6A, the interconnection lines in each column (not specifically drawn) between any two neighboring stages define an interstage exchange (e.g., 605). Recall that when the prescribed external I/O order of a multi-stage network does not coincide with the default external I/O order, the double identities between an external input and a stage-1 input and between an external output and a final-stage output are split into two separate points which are joined by a straight line. The straight lines representing the prescribed external input order form a pattern which is called the “input exchange”. Similarly, the pattern formed by the straight lines representing the prescribed external output order is called the “output exchange”. The input and output exchanges are abbreviated as the “I/O exchanges”. Therefore, the input exchange and output exchange of a multi-stage network can be regarded as the address conversions from the prescribed input order to the default external input order, and from the prescribed output order to the default external output order, respectively. Note that in a graph representation of a multi-stage network, there is no difference between the interstage exchanges and I/O exchanges. In real implementations, however, the interstage exchanges are realized by physical wirings while the I/O exchanges may or may not be. Recall that the I/O exchanges represent address conversions, so they can be virtually implemented by explicitly labeling each individual I/O port with an address according to the prescribed order or physically implemented by wirings, depending on the situation.

[0242] Definition A23: “K×K exchange”. Any exchange defines a one-to-one correspondence from the points on its left-hand-side to the points on its right-hand-side. When the exchange is connecting K pairs of points, it is called a “K×K exchange”. Since the K points on each of the two sides of the K×K exchange are labeled with the addresses from 0 to K−1, each interconnection line in the exchange maps (or more formally, permutes) an address in the range from 0 to K−1 to another address also in the range from 0 to K−1. Thus the K×K exchange can be defined as a permutation of addresses from 0 to K−1. For example, the 2-stage network shown in FIG. 6D is equipped with the input exchange 681, 0→2, 1→0, 2→3, 3→1, 4→4, and the output exchange 683, 0→2, 1→3, 2→0, 3→1. Meanwhile the interstage exchange 682 is 0→0, 1→2, 2→3, 3→1, 4→4.

[0243] Definition A24: “product of two exchanges”. A K×K exchange X₃ is said to be the product of two K×K exchanges X₁ and X₂, which is written as X₃=X₁X₂, when the permutation due to the exchange X₃ is equivalent to the combined effect of the sequential application of the permutations due to X₁ and then X₂. Note that X₁X₂≠X₂X₁ in general. In graph representation, the product of two exchanges can easily be obtained from the two exchanges by replacing each pair of two connected line segments, each from one exchange, with a single straight line. For example, as shown in FIG. 6E, the product of two 16×16 exchanges 691 and 692 is the 16×16 exchange 693. The product of the same two exchanges, but in reversed order, that is, with the exchange 692 now in front of 691, as shown in FIG. 6F, results in a different exchange 694.
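The product of Definition A24 is ordinary composition of permutations, as the following sketch suggests. An exchange is represented here, by assumption, as a list x in which x[i] is the address that address i is mapped to. The two 5×5 sample permutations are used purely as test data for the non-commutativity remark; they are not the 16×16 exchanges of FIGS. 6E and 6F.

```python
def exchange_product(x1, x2):
    """Return X1 X2: apply X1 first, then X2."""
    return [x2[j] for j in x1]


# Sample 5x5 exchanges (illustrative values only).
x1 = [2, 0, 3, 1, 4]   # 0->2, 1->0, 2->3, 3->1, 4->4
x2 = [0, 2, 3, 1, 4]   # 0->0, 1->2, 2->3, 3->1, 4->4

print(exchange_product(x1, x2))
print(exchange_product(x2, x1))
print(exchange_product(x1, x2) != exchange_product(x2, x1))
```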

[0244] The I/O exchanges, together with the interstage exchanges, are called the “exchanges in the multi-stage network”. Therefore, there are four versions of a multi-stage network: with and without an input exchange and with and without an output exchange. The default version, as shown in FIG. 6A, is without the I/O exchanges. Note that the routability of a multi-stage network relies only on the interstage exchanges, not the I/O exchanges, since the I/O exchanges do not alter the intrinsic connectivity of the network.

[0245] For a 2^(n)×2^(n) multi-stage interconnection network, the addresses of I/O ports can be expressed as n-bit binary numbers. For example, FIG. 7 shows a 16×16 4-stage network 700 as an example of a 2^(n)×2^(n) multi-stage network where n=4. All of the I/O ports of the 16×16 4-stage interconnection network 700 are linearly ordered in a top-to-bottom manner with each labeled with a 4-bit binary number.

[0246] A special kind of 2^(n)×2^(n) exchange is called a “bit-permuting exchange” when each of the 2^(n) interconnection lines in the exchange maps a binary address O₁O₂ . . . O_(n) of an output port in a stage to a binary address I₁I₂ . . . I_(n) of an input port in the next succeeding stage in such a way that each mapping is restricted to be a “bit-permutation” by which O₁O₂ . . . O_(n) and I₁I₂ . . . I_(n) can be transformed to each other by only permuting the positions of the bits; in other words, the numbers of 0's and 1's are not altered.

[0247] For example, as shown in FIG. 7, the line connecting from the output port 701 labeled with the address 0110 to the input port 702 in the next stage labeled with the address 1100 corresponds to a bit-permutation which, in particular, is a 1-bit left-rotation (or equivalently a 3-bit right-rotation) of the address 0110 to give the address 1100. For another example, the line connecting from the output port 703 labeled with the address 1010 to the input port 704 in the next stage labeled with the address 1001 can be regarded as a bit permutation of the binary address defined as: the 1^(st) bit is shifted to the 4^(th) place, the 4^(th) bit to the 2^(nd) place, the 2^(nd) bit to the 3^(rd) place, and the 3^(rd) bit to the 1^(st) place.
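The two address mappings just described can be checked with a brief sketch. Addresses are handled as bit strings and positions are counted from 1 at the left, as in the text; the function names and the dictionary form of the permutation are assumptions of this sketch.

```python
def rotate_left(address, r=1):
    """Rotate a bit string left by r positions (a negative r rotates right)."""
    r %= len(address)
    return address[r:] + address[:r]


def permute_bits(address, moves):
    """moves[p] = q means the bit in position p moves to position q (1-based)."""
    out = [None] * len(address)
    for p, q in moves.items():
        out[q - 1] = address[p - 1]
    return "".join(out)


# Port 701 -> 702: a 1-bit left-rotation, equivalently a 3-bit right-rotation.
print(rotate_left("0110", 1), rotate_left("0110", -3))

# Port 703 -> 704: 1st bit to 4th place, 4th to 2nd, 2nd to 3rd, 3rd to 1st.
print(permute_bits("1010", {1: 4, 4: 2, 2: 3, 3: 1}))
```

Because only positions change, the count of 0's and 1's in an address is the same before and after either mapping.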

[0248] Among infinitely many multi-stage networks with different sizes, a class of 2^(n)×2^(n) networks is of particular interest, namely those in which all nodes are 2×2 and every exchange is bit-permuting. Such 2^(n)×2^(n) multi-stage networks are called “bit-permuting networks”. Since a bit-permuting network can be completely determined by specifying each exchange in it, and each exchange corresponds to a particular bit permutation on the binary addresses, a bit-permuting network can thus be simply defined by a sequence of bit-permutations, which is particularly useful when analyzing its network properties. Further details about the bit-permuting network will be given in the sequel.

[0249] B. 2-STAGE INTERCONNECTION

[0250] 1. Plain 2-stage interconnection network

[0251] Definition B1: “plain 2-stage interconnection network”. The “plain 2-stage interconnection network with parameters m and n”, denoted as 2Stg(m, n), is composed of n m×m input nodes and m n×n output nodes such that, for 0≦x<m and 0≦y<n, there is an interconnection line from the x^(th) output of the y^(th) input node to the y^(th) input of the x^(th) output node. This type of construction procedure is referred to as the “plain 2-stage interconnection”. The interconnection lines form the interstage exchange. There are no I/O exchanges in this construction.

[0252] The input and output nodes are called the “stage-1 node” and “stage-2 node”, respectively, and the I/O of a stage-1 node (resp. stage-2 node) are called “stage-1 I/O” (resp. “stage-2 I/O”). When every node in 2Stg(m, n) is replaced by a switch, the result is an nm×nm switching network.

EXAMPLE

[0253] As illustrated in FIG. 8, an interconnection line connects every node in the horizontal plane to every node in the perpendicular plane, respectively. By convention, it can be assumed that signals enter the network from the left. Thus, the eight nodes (801) in the horizontal plane are called the stage-1 nodes, and the two nodes (802) in the perpendicular plane are called the stage-2 nodes, resulting in 2Stg(2, 8) (800). When every node is replaced by a switch, the result is a 16×16 switching network.
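The wiring rule of Definition B1 can be written out directly, as in the sketch below. Each interconnection line is represented, by assumption, as a pair of ports: (input node y, output port x) and (output node x, input port y). The function name two_stage_lines is hypothetical.

```python
def two_stage_lines(m, n):
    """Interconnection lines of 2Stg(m, n): n input nodes (m x m), m output nodes (n x n)."""
    return [((y, x), (x, y)) for y in range(n) for x in range(m)]


m, n = 2, 8   # the 2Stg(2, 8) network of the example
lines = two_stage_lines(m, n)

sources = {src for src, _ in lines}
targets = {dst for _, dst in lines}
print(len(lines), len(sources), len(targets))

# Each stage-1 node is linked to every stage-2 node exactly once.
for y in range(n):
    reached = sorted(dst[0] for src, dst in lines if src[0] == y)
    assert reached == list(range(m))
```

For m=2 and n=8 the rule yields 16 lines, one per stage-1 output and one per stage-2 input, consistent with the 16×16 size stated in the example.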

[0254] 2. Addressing schemes and coordinate interchange

[0255] By convention, the input nodes of a 2Stg(m, n) are labeled by y=0, 1, . . . , n−1 and the output nodes by x=0, 1, . . . , m−1, in the same manner as in FIG. 8. Recall from Definition A20 that the node ordering at each of the two stages naturally induces an ordering on the I/O at that stage, which appears as an array of addresses 0, 1, 2, . . . , arranged top-down in the conventional graph representation. Therefore, under the “linear addressing scheme” of 2Stg(m, n), the x^(th) I/O of the y^(th) input node, 0≦x<m and 0≦y<n, is at address my+x, and the y^(th) I/O of the x^(th) output node is at address nx+y. The addresses range from 0 to mn−1. The interstage exchange is the mapping: my+x→nx+y.

[0256] Under the “vector addressing scheme” of 2Stg(m, n), the x^(th) I/O of the y^(th) input node is at the vector address (y, x), and the y^(th) I/O of the x^(th) output node is at the vector address (x, y), for 0≦x<m and 0≦y<n. The aforementioned linear address follows the lexicographic order of the vector address. In particular, the linear addresses of stage-1 I/O follow the (y, x) lexicographic order of stage-1 I/O, and the linear addresses of stage-2 I/O follow the (x, y) lexicographic order of stage-2 I/O. The interstage exchange, in terms of the vector address, is simply the interchange between the x and y components of the vector address:

[0257] (y, x)→(x, y).

[0258] For this reason, the interstage exchange inside the 2-stage interconnection network is also referred to as the “coordinate interchange”, even when no particular addressing scheme is specified.

EXAMPLE 2

[0259] A 2Stg(m, n) with m=3 and n=5 can be represented by each of the aforementioned addressing schemes. FIG. 9 shows the network 900 under the linear addressing scheme, in which the stage-1 I/O (902, 903) and stage-2 I/O (904, 905) are addressed in the naturally induced I/O order. The element 901 is the interstage exchange, which connects each stage-1 output port at address 3y+x, e.g. 11=3×3+2, to the stage-2 input port at address 5x+y, e.g. 5×2+3=13, for x=0, 1, 2 and y=0, 1, 2, 3, 4. Under the vector addressing scheme of FIG. 10, in which the addresses of the stage-1 (1002, 1003) and stage-2 (1004, 1005) nodes of the network 1000 are shown in 2-dimensional vector form, one can readily see that the interstage exchange 1001, also called the coordinate interchange, maps each stage-1 output address (y, x) to the corresponding stage-2 input address (x, y); the interchange of the coordinates in the vector addresses is thus clear.
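The two addressing schemes of this example can be checked numerically. The following is a minimal Python sketch offered for illustration only; the function names `interstage_exchange` and `to_vector` are hypothetical and do not appear in the disclosure.

```python
def interstage_exchange(m, n):
    """Coordinate interchange of 2Stg(m, n) under linear addressing.

    Entry m*y + x (x-th output of the y-th stage-1 node) holds
    n*x + y (y-th input of the x-th stage-2 node).
    """
    table = [0] * (m * n)
    for y in range(n):          # stage-1 (input) node index
        for x in range(m):      # port index on that node
            table[m * y + x] = n * x + y
    return table


def to_vector(address, width):
    """Split a linear address into (node, port) for nodes with `width` ports."""
    return divmod(address, width)


if __name__ == "__main__":
    exchange = interstage_exchange(3, 5)        # the 2Stg(3, 5) of FIG. 9
    print(exchange[11])                         # stage-1 output 11 is wired to stage-2 input 13
    print(to_vector(11, 3), to_vector(13, 5))   # (y, x) = (3, 2) and (x, y) = (2, 3)
```

The table form makes it easy to confirm that the exchange is a one-to-one wiring of all mn addresses.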

[0260] 3. 2X and X2 interconnection networks

[0261] For the plain 2-stage interconnection network, the default external I/O order (Definition A21) follows the (y, x) lexicographic order of stage-1 input addresses and the (x, y) lexicographic order of stage-2 output addresses. Two other systems of external I/O order for the 2-stage interconnection network are described as follows.

[0262] Definition B2: “2X interconnection network”. The “(y, x) system” of external I/O order of the 2Stg(m, n) follows the (y, x) lexicographic order of both stage-1 input addresses and stage-2 output addresses. This system differs from the default system only in the external output order. Recall from Definition A22 that the external output order in the (y, x) system, being different from the default external output order, induces an output exchange. This output exchange converts from the (x, y) lexicographic order on stage-2 outputs to the (y, x) lexicographic order on external outputs; thus it is the inverse coordinate interchange, that is, a mirror image of the interstage exchange. The same construction procedure as the plain 2-stage interconnection, but with the inverse coordinate interchange appended as the output exchange, is referred to as the 2-stage interconnection with an output exchange, or simply as the “2X interconnection”. A network so constructed is called a “2X interconnection network”. The 2X version of a 2Stg(m, n), that is, the 2X interconnection network with parameters m and n, is denoted as 2X(m, n). A data signal progresses through a generic 2X interconnection network along the path specified by path diagram 1100 in FIG. 11A.

EXAMPLE 3

[0263] A 2X version of 2Stg(3,5) is the network 1200 shown in FIG. 12. The output exchange 1202, which is the inverse of the coordinate interchange 1201, is appended to the 2Stg(3,5) (1000) in FIG. 10.

[0264] Definition B3: “X2 interconnection network”. The “(x, y) system” of external I/O order of the 2Stg(m, n) follows the (x, y) lexicographic order of both stage-1 input addresses and stage-2 output addresses. This system differs from the default system only in the external input order. The external input order in the (x, y) system, being different from the default external input order, induces an input exchange. This input exchange converts from the (y, x) lexicographic order on stage-1 inputs to the (x, y) lexicographic order on external inputs; thus it is again the inverse coordinate interchange, that is, a mirror image of the interstage exchange. The same construction procedure as the plain 2-stage interconnection, but with the inverse coordinate interchange prepended as the input exchange, is referred to as the 2-stage interconnection with an input exchange, or simply as the “X2 interconnection”. A network so constructed is called an “X2 interconnection network”. The X2 version of a 2Stg(m, n), that is, the X2 interconnection network with parameters m and n, is denoted as X2(m, n). A data signal progresses through a generic X2 interconnection network along the path specified by path diagram 1110 in FIG. 11B.

EXAMPLE 4

[0265] An X2 version of 2Stg(3,5) is the network 1300 shown in FIG. 13. The input exchange (1302), which is the inverse of the coordinate interchange (1301), is prepended to the 2Stg(3,5) (1000) in FIG. 10.

[0266] The above three types of networks and the corresponding construction procedures will be regarded as three versions of “2-stage interconnection network” and “2-stage interconnection”, respectively.
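The statement that the I/O exchange of the 2X and X2 versions is the inverse coordinate interchange can be illustrated with a short Python sketch. It is only a sketch under the linear addressing scheme described earlier; the helper names `coordinate_interchange`, `inverse` and `compose` are hypothetical, and exchanges are represented as plain lists mapping a position to its image.

```python
def coordinate_interchange(m, n):
    """Interstage exchange of 2Stg(m, n): address m*y + x maps to n*x + y."""
    return [n * (a % m) + (a // m) for a in range(m * n)]


def inverse(exchange):
    """Invert an exchange given as a list (position -> image)."""
    inv = [0] * len(exchange)
    for src, dst in enumerate(exchange):
        inv[dst] = src
    return inv


def compose(first, second):
    """Wire `first` directly into `second` (no nodes in between)."""
    return [second[first[a]] for a in range(len(first))]


if __name__ == "__main__":
    interstage = coordinate_interchange(3, 5)
    io_exchange = inverse(interstage)      # output exchange of 2X(3, 5), input exchange of X2(3, 5)
    # Ignoring the nodes in between, the two exchanges cancel each other.
    print(compose(interstage, io_exchange) == list(range(15)))
    # The inverse has the same form with the roles of m and n swapped.
    print(io_exchange == coordinate_interchange(5, 3))
```

The second check reflects that nx+y→my+x is itself a coordinate interchange, with the two parameters exchanged.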

[0267] Since the existence of the input exchange or output exchange in a 2-stage interconnection network is due to the different ordering systems adopted by the network, the I/O exchanges can be implemented, as alluded to in Definition A22, either virtually by address labeling or physically by wiring. In graph representation, however, the I/O exchanges are always explicitly drawn in the manner shown in FIGS. 11 and 12.

[0268] 4. Generalization of 2-stage interconnection

[0269] Recall that the routability of an interconnection network depends only on the intrinsic internal connectivity of the network; thus for any multi-stage network, the routability depends on its interstage exchanges only, and for a 2-stage network, in particular, it depends only on its single interstage exchange. Specifically, the necessary condition for ensuring the routability of any 2-stage interconnection network is the existence of an interconnection line from every input node to every output node, or equivalently, that the output ports of each input node are linked with distinct output nodes, and the input ports of each output node are linked with distinct input nodes. Recall that the interstage exchange of a 2Stg(m, n) is the coordinate interchange, which requires the existence of an interconnection line from the x-th output port of the y-th input node to the y-th input port of the x-th output node for 0≦x<m and 0≦y<n, and the routability is thus guaranteed. It is clear that the coordinate interchange is just a special case of those interstage exchanges preserving the routability of a 2-stage interconnection network. The reason for adopting the coordinate interchange as the interstage exchange is the translation from the 3-dimensional representation of two orthogonal stacks of planes to the planar graph representation. This reason alone of course does not preclude alternative interstage exchanges, as long as they also guarantee the routability. Therefore, a “generalized 2-stage interconnection network” is a 2-stage network interconnected in such a way that its interstage exchange fulfils the aforementioned necessary condition for routability, and this kind of interconnection is called the “generalized 2-stage interconnection”. In short, a generalized 2-stage interconnection network is just a routable 2-stage network.
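The routability condition stated above lends itself to a direct check. The sketch below is illustrative only: it assumes the linear addressing scheme (stage-1 output address a belongs to input node a // m, stage-2 input address b belongs to output node b // n), and the function name `is_routable_2stage` is hypothetical.

```python
def is_routable_2stage(exchange, m, n):
    """Check the routability condition for a 2-stage network.

    exchange[a] is the stage-2 input address wired to stage-1 output
    address a.  There are n input nodes with m outputs each and
    m output nodes with n inputs each.
    """
    if sorted(exchange) != list(range(m * n)):
        return False                          # not a one-to-one wiring
    for y in range(n):                        # each input node
        targets = {exchange[m * y + x] // n for x in range(m)}
        if len(targets) != m:                 # some output node is reached twice
            return False
    return True


if __name__ == "__main__":
    coordinate = [5 * (a % 3) + (a // 3) for a in range(15)]   # 2Stg(3, 5)
    print(is_routable_2stage(coordinate, 3, 5))       # coordinate interchange
    print(is_routable_2stage([0, 2, 3, 1], 2, 2))     # another routable wiring
    print(is_routable_2stage([0, 1, 2, 3], 2, 2))     # straight-through wiring
```

Because there are exactly m output nodes, checking that each input node reaches m distinct output nodes already implies that each output node receives one line from every input node.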

[0270] Note that the 2-stage interconnection network of any version can be further generalized so that the input nodes are of size p×m and the output nodes are of size n×q, where p may or may not be equal to m, and q may or may not be equal to n. The overall network is then of size pn×mq, and is said to be with parameters m, n, p, and q. When every node is replaced by a switch, the result is a pn×mq switching network. For simplicity, the 2-stage interconnection networks of any version appearing in the context are of the type with parameters m and n only.

[0271] 5. Recursive 2-stage construction

[0272] Definition B4. “plain 2-stage tensor product, 2X tensor product, and X2 tensor product between two multi-stage networks”. Let Φ be an M×M i-stage network and Ψ an N×N j-stage network. Fill the role of each input node in a plain 2-stage interconnection network with parameters M and N (2Stg(M, N)) with a copy of Φ and each output node with a copy of Ψ. Ungroup nodes and lines inside every node so that they become elements directly belonging to the whole construction. The result is an MN×MN (i+j)-stage network, which is called the “plain 2-stage tensor product of Φ and Ψ”.

[0273] If the plain 2-stage interconnection network (2Stg(M, N)) in this definition is replaced by the 2X interconnection network with parameters M and N (2X(M, N)), then the resulting MN×MN (i+j)-stage network is called the “2X tensor product of Φ and Ψ”.

[0274] If the 2Stg(M, N) in the definition is replaced by X2(M, N), then the resulting MN×MN (i+j)-stage network is called the “X2 tensor product of Φ and Ψ”.

[0275] The above three types of tensor products will be regarded as three versions of “2-stage tensor product”.

[0276] Similar to the 2-stage interconnection networks, the 2-stage tensor product of any version can also be generalized to the tensor product of a P×M network and an N×Q network, resulting in a PN×MQ network, but the immediate focus is still on the type with parameters M and N only.

[0277] For example, if Φ is a 3×3 single-node network and Ψ is a 5×5 single-node network, then the plain 2-stage tensor product of Φ and Ψ is the 15×15 2-stage network 1000 shown in FIG. 10, the 2X tensor product of Φ and Ψ is the 15×15 2-stage network 1200 shown in FIG. 12, and the X2 tensor product of Φ and Ψ is the 15×15 2-stage network 1300 shown in FIG. 13.

[0278] In the above definition, the network Φ may itself be a tensor product of two smaller networks, and so may Ψ. Thus the mechanism of forming tensor products can be recursively invoked. Through a recursive procedure in forming tensor products, a large multi-stage network can be constructed from smaller multi-stage networks and ultimately from single-node networks. The following terminology is employed throughout the context. The recursive procedure in forming tensor products to construct a large multi-stage network is referred to as the “recursive applications of 2-stage interconnection” or “recursive 2-stage construction”, or even simply “recursive construction” when 2-stage construction is understood in the context; the network so constructed from single-node networks is referred to as the “recursive 2-stage interconnection network”. When referring to a particular one of the three types of the formation of tensor products, the terms “recursive plain 2-stage construction” (“recursive plain 2-stage interconnection network”), “recursive 2X construction” (“recursive 2X interconnection network”), and “recursive X2 construction” (“recursive X2 interconnection network”) are correspondingly used. The single-node networks in the recursive construction are referred to as the “basic building blocks” or simply “building blocks” of the recursive construction. In general, the basic building blocks may include nodes of any size, as shown in FIG. 14, which includes 2×2, 3×3 and 5×3 nodes as basic building blocks. A special case of particular interest is when all basic building blocks are 2×2 nodes; the recursive construction then leads to a 2^(k)×2^(k) k-stage network for some k.

EXAMPLE 5

[0279] FIG. 14 shows how a 30×18 network is constructed in two steps by the recursive 2-stage construction, with 2×2, 3×3 and 5×3 nodes as basic building blocks. Step 1: the plain 2-stage tensor product of the 2×2 single-node network 1401 and the 3×3 single-node network 1402 yields a 6×6 network 1403. Step 2: the plain 2-stage tensor product of the 6×6 network 1403 from step 1 and the 5×3 single-node network 1404 gives the desired 30×18 network 1400.

[0280] The procedures in this recursive 2-stage construction can be logged by a binary tree diagram as shown in FIG. 15. The “binary tree” is a fundamental concept in computer science and can be found in any standard textbook on the subject, especially those on data structures. The standard terms concomitant to this concept include “node of a tree”, “root”, “leaf”, “internal node”, “sub-tree”, “left son”, and “right son”. The meanings of the terms adopted in this context are as follows. Every binary tree is rooted. The “root” is the unique node in the tree without a “father” (parent node). Every node (including the root) of a binary tree has either 0 or 2 “sons” (child nodes) and is accordingly called a “leaf” (with 0 sons) or an “internal node” (with 2 sons). A binary tree can be as small as a single-node tree, that is, it contains the “root” only. A node J is called a “descendant” of a node K if either J=K or, recursively, J is a descendant of a son of K. In a binary tree, the sub-tree rooted at a node J is the part of the binary tree spanning all of the descendants of J. A legitimate sub-tree of a binary tree can be as small as a leaf or as large as the entire tree. Every sub-tree of a binary tree is a binary tree. A binary tree can be represented by a planar graph with the root at the top level and every other node one level lower than its father. In such a representation, the two sons of an internal node are called the “left son” and the “right son” according to their positions in the graph representation.

[0281] On the tree 1510 in FIG. 15 are a root 1511, an internal node 1512, and three leaves 1513, 1514, and 1515. The three leaves 1513, 1514, and 1515 correspond, respectively, to the three basic building blocks, that is, the 2×2 network 1401, the 3×3 network 1402, and the 5×3 network 1404 in FIG. 14. The sub-tree 1516 rooted at the internal node 1512 corresponds to the intermediate 6×6 network 1403, and the entire binary tree 1510 corresponds to the overall 30×18 network 1400. From the construction point of view, the internal node 1512 represents the first step in the above recursive 2-stage construction, that is, the step of constructing the 6×6 sub-network 1403 from the tensor product (plain 2-stage tensor product here) of the 2×2 network 1401 and the 3×3 network 1402, wherein the 2×2 network 1401 corresponds to the sub-tree 1517 rooted at the node 1513, and the 3×3 network 1402 corresponds to the sub-tree 1518 rooted at the node 1514. The root node 1511 represents the second and final step of the recursive construction. This step constructs the final 30×18 network 1400 from the plain 2-stage tensor product of the 6×6 network 1403 (corresponding to the sub-tree 1516 rooted at 1512) and the 5×3 network 1404 (corresponding to the sub-tree 1519 rooted at 1515). As a whole, the tree 1510 logs the overall procedure of the above recursive 2-stage construction.
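The way a binary tree logs the construction can be sketched in Python. The sketch is illustrative only and tracks just the size and stage count of each sub-network, using the rule that the tensor product of a P×M i-stage network and an N×Q j-stage network is a PN×MQ (i+j)-stage network; the class name `Tree` and the function name `network_shape` are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass
class Tree:
    """A node of the binary tree; a leaf carries the size of its building block."""
    size: Optional[Tuple[int, int]] = None   # (inputs, outputs), leaves only
    left: Optional["Tree"] = None
    right: Optional["Tree"] = None


def network_shape(tree):
    """Return (inputs, outputs, stages) of the network logged by `tree`."""
    if tree.left is None and tree.right is None:
        return tree.size[0], tree.size[1], 1     # a single-node network
    p, m, i = network_shape(tree.left)           # left sub-tree: P x M, i stages
    n, q, j = network_shape(tree.right)          # right sub-tree: N x Q, j stages
    return p * n, m * q, i + j


if __name__ == "__main__":
    # Tree 1510 of FIG. 15: leaves 2x2 and 3x3 under node 1512, leaf 5x3 under the root.
    node_1512 = Tree(left=Tree(size=(2, 2)), right=Tree(size=(3, 3)))
    root_1511 = Tree(left=node_1512, right=Tree(size=(5, 3)))
    print(network_shape(node_1512))   # (6, 6, 2)
    print(network_shape(root_1511))   # (30, 18, 3)
```

Each recursive call corresponds to one sub-tree, so the post-order traversal mirrors the bottom-up order of the construction steps.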

[0282] A recursive 2-stage construction logged by a binary tree yields a recursive 2-stage interconnection network, provided a network is prescribed for each leaf of the binary tree. The binary tree is then said to be “associated with” the recursive 2-stage interconnection network so constructed, with the prescribed networks as “building blocks” of the construction. The correspondence between a recursive 2-stage construction and its associated binary tree is best illustrated by FIG. 14 and FIG. 15 in Example 5 above. Note that the binary tree is used here only to log the precedence among the recursive steps of the construction and does not explicitly require the tensor product employed at each recursive step to be the plain 2-stage tensor product. In other words, the 2X or X2 tensor product applies as well.

[0283] Recall that a special case of particular interest is when all building blocks in the recursive 2-stage construction are single cells (2×2 nodes). Then, the result is a 2^(k)×2^(k) k-stage network, where k is the number of leaves in the associated binary tree. This special case leads to the definition below.

[0284] Definition B5. “recursive plain 2-stage interconnection network of cells”, “recursive 2X interconnection network of cells” and “recursive X2 interconnection network of cells”. A 2^(k)×2^(k) k-stage network constructed from recursively forming plain 2-stage tensor products using single cells as building blocks is called a “recursive plain 2-stage interconnection network of cells”, and the corresponding recursive procedure is called the “recursive plain 2-stage construction from cells”. A 2^(k)×2^(k) k-stage network constructed from recursive 2X tensor products using single cells as building blocks is called a “recursive 2X interconnection network of cells”, and the corresponding recursive procedure is called the “recursive 2X construction from cells”. A 2^(k)×2^(k) k-stage network constructed from recursive X2 tensor products using single cells as building blocks is called a “recursive X2 interconnection network of cells”, and the corresponding recursive procedure is called the “recursive X2 construction from cells”. Note that when there is no need to specify the type of tensor products in the recursion, the terms “recursive 2-stage interconnection network of cells” and “recursive 2-stage construction from cells” are used collectively.

EXAMPLE 6

[0285] FIGS. 16-19 show how the 8×8 3-stage network 1600 is built as a recursive X2 interconnection network of cells. While Example 5 shows the recursion from bottom to top, that is, building smaller networks first and larger networks afterwards, this example proceeds in the reverse direction. Starting from the larger network, the 8×8 network 1600 can be constructed as an X2 tensor product of the 2×2 network 1601 and the 4×4 network 1602, as shown in FIG. 16. Then, as shown in FIG. 17, each 4×4 network 1611 can recursively be an X2 tensor product of 2×2 networks (or cells) 1612. The nodes and lines inside every 4×4 node 1611 are then ungrouped so that they become elements directly belonging to the whole construction 1621, as shown in FIG. 18. Now each node 1622 in the construction is a cell, so the resulting 8×8 network 1600 is a recursive X2 interconnection network of cells. Usually, it is redrawn into an equivalent version with better appearance, as the network 1600 shown in FIG. 19. The reason is that, unlike in the recursive plain 2-stage construction, in a recursive 2X or X2 construction the stack of either the input exchanges or the output exchanges of the smaller networks is concatenated with the large exchange in the tensor product. As a common practice, the successive exchanges are replaced by the single exchange that is the product of these exchanges; graphically, each zigzag line is straightened into a straight line. Therefore, in this example, the resulting 8×8 exchange 1631 in FIG. 19 is the product of the 8×8 exchange 1623 of FIG. 18, which results from stacking the 4×4 input exchange 1624 from the upper 4×4 network and the 4×4 input exchange 1625 from the lower 4×4 network, and the 8×8 interstage exchange 1626. The binary tree associated with this recursive X2 interconnection network of cells is shown as the tree 2000 in FIG. 20.

[0286] C. BANYAN-TYPE NETWORKS AND TRACE AND GUIDE OF A BIT-PERMUTING NETWORK

[0287] 1. Permutation on integers

[0288] Definition C1: “permutation”. A “permutation” σ on integers from 1 to n is a one-to-one function from the set {1, 2, . . . , n} to itself. The “image” of a number k under the permutation σ is denoted as σ(k). For example, consider the permutation σ on the integers 1, 2, 3, and 4 such that σ(1)=4, σ(2)=3, σ(3)=1, and σ(4)=2. This permutation σ can be expressed as 1→4→2→3→1, wherein the notation “a→b” means that a is mapped to b under σ. The “cycle representation” simplifies the notation as σ=(1 4 2 3). Note that in the cycle representation, the expression σ=(1 4 2 3) is entirely equivalent to σ=(4 2 3 1), σ=(2 3 1 4), or σ=(3 1 4 2). Multiplication of two permutations σ and π is customarily defined as the functional composition from left to right: (σπ)(k)=π(σ(k)). For example, if σ=(1 4 2 3) and π=(2 3), then (σπ)(4)=π(σ(4))=π(2)=3.

[0289] There are altogether n! permutations on integers from 1 to n. In the terminology of modern algebra, they form a “group” under multiplication. The identity mapping, denoted as “id”, is regarded as one of the permutations. Every permutation is invertible, that is, for every permutation σ, there exists a unique permutation τ such that στ=id=τσ. In that case, τ is called the inverse of σ and is written as τ=σ⁻¹. For example, given the permutation σ=(1 4 2 3) as above, σ⁻¹(k) is the number that is mapped to k under the permutation σ, for every k, and σ⁻¹=(3 2 4 1).
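The cycle notation, the left-to-right product, and the inverse can be exercised with a few lines of Python. This is a minimal sketch for illustration; the representation of a permutation as a dictionary and the helper names `from_cycles`, `product` and `inverse` are assumptions of the sketch, not part of the disclosure.

```python
def from_cycles(n, *cycles):
    """Permutation on 1..n as a dict, built from disjoint cycles."""
    perm = {k: k for k in range(1, n + 1)}
    for cycle in cycles:
        for i, a in enumerate(cycle):
            perm[a] = cycle[(i + 1) % len(cycle)]   # each entry maps to its successor
    return perm


def product(sigma, pi):
    """Left-to-right product: (sigma pi)(k) = pi(sigma(k))."""
    return {k: pi[sigma[k]] for k in sigma}


def inverse(sigma):
    """Inverse permutation: reverse every arrow k -> sigma(k)."""
    return {image: k for k, image in sigma.items()}


if __name__ == "__main__":
    sigma = from_cycles(4, (1, 4, 2, 3))
    pi = from_cycles(4, (2, 3))
    print(sigma)                      # {1: 4, 2: 3, 3: 1, 4: 2}
    print(product(sigma, pi)[4])      # 3
    print(inverse(sigma))             # {4: 1, 3: 2, 1: 3, 2: 4}, i.e. the cycle (1 3 2 4)
```

Reading a cycle backwards gives the cycle of the inverse, which is why reversing the dictionary entries suffices.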

[0290] 2. Bit-permuting exchange

[0291] A permutation σ on integers from 1 to n “induces” a 2^(n)×2^(n) exchange X_(σ) via

X_(σ): b_(σ(1))b_(σ(2)) . . . b_(σ(n)) → b₁b₂ . . . b_(n)

[0292] wherein the notation “a→b” immediately above means that a is mapped to b by the exchange. The mnemonic interpretation of X_(σ) is as follows: the value of the j^(th) bit of the binary string before the exchange X_(σ) gives the value of the σ(j)^(th) bit of the corresponding binary string afterwards.

[0293] An equivalent formula for X_(σ) is

X_(σ): b₁b₂ . . . b_(n) → b_(σ⁻¹(1))b_(σ⁻¹(2)) . . . b_(σ⁻¹(n)).

EXAMPLE 1

[0294] Take the permutation (n n−1 . . . 1) as an example. It maps n to n−1, n−1 to n−2, . . . , 2 to 1, and 1 to n. Thus it induces the following 2^(n)×2^(n) exchange:

X_((n n−1 . . . 1)): b₁b₂ . . . b_(n) → b₂ . . . b_(n−1)b_(n)b₁

[0295] This is called the 2^(n)×2^(n) “shuffle exchange”, which performs the left-rotation of every n-bit number by one bit. The 8×8 exchange 2101 shown in FIG. 21A is the exchange X₍₃ ₂ ₁₎, or the 8×8 shuffle exchange.

[0296] Another example is the permutation (3 1), which induces the 8×8 exchange 2103 shown in FIG. 21C. Under this exchange, the values of the 1^(st), 2^(nd) and 3^(rd) bits of the bit pattern before the exchange give the values of the 3^(rd), 2^(nd) and 1^(st) bits of the bit pattern after the exchange, respectively.
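The mnemonic interpretation of an induced exchange translates directly into code. The following Python sketch is offered as an illustration; it assumes that bit 1 is the leftmost bit of the n-bit address, consistent with the description of the shuffle exchange as a left-rotation, and the helper names `from_cycles` and `apply_exchange` are hypothetical.

```python
def from_cycles(n, *cycles):
    """Permutation on 1..n as a dict, built from disjoint cycles."""
    perm = {k: k for k in range(1, n + 1)}
    for cycle in cycles:
        for i, a in enumerate(cycle):
            perm[a] = cycle[(i + 1) % len(cycle)]
    return perm


def apply_exchange(sigma, n, address):
    """X_sigma: bit j of the input (leftmost bit is bit 1) becomes bit sigma(j)."""
    bits = format(address, "0{}b".format(n))     # bits[j - 1] is b_j
    out = [""] * n
    for j in range(1, n + 1):
        out[sigma[j] - 1] = bits[j - 1]
    return int("".join(out), 2)


if __name__ == "__main__":
    shuffle = from_cycles(3, (3, 2, 1))          # induces the 8x8 shuffle exchange
    swap_1_3 = from_cycles(3, (3, 1))            # induces the exchange of FIG. 21C
    print([apply_exchange(shuffle, 3, a) for a in range(8)])    # [0, 2, 4, 6, 1, 3, 5, 7]
    print([apply_exchange(swap_1_3, 3, a) for a in range(8)])   # [0, 4, 2, 6, 1, 5, 3, 7]
```

The first list is the left-rotation of each 3-bit address by one bit; the second swaps the first and third bits while leaving the middle bit in place.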

[0297] Definition C2: “bit-permuting exchange”. A 2^(n)×2^(n) “bit-permuting exchange” is an exchange induced by a permutation on integers from 1 to n.

[0298] The “rank” of a nonidentity permutation σ on integers from 1 to n means the smallest number d such that σ(d)≠d.

[0299] For 1≦d<n, the exchange X_((n n−1 . . . d)) is called the 2^(n)×2^(n) “shuffle exchange of rank d” and denoted as SHUF^((n))_(d). In particular, the 2^(n)×2^(n) shuffle exchange of rank 1 is simply the 2^(n)×2^(n) shuffle exchange SHUF^((n)). Similarly, for 1≦d<n, the exchange X_((d d+1 . . . n)) is called the 2^(n)×2^(n) “inverse shuffle exchange of rank d” and denoted by (SHUF^((n))_(d))⁻¹.

[0300] For 1≦d<n, the 2^(n)×2^(n) exchange X_((n d)) is called the 2^(n)×2^(n) “banyan exchange of rank d” and denoted as BANY^((n))_(d). In particular, the 2^(n)×2^(n) banyan exchange of rank 1 is simply called the 2^(n)×2^(n) banyan exchange and denoted as BANY^((n)).

[0301] Denote by σ^((n)) the permutation that performs the end-to-end swap on the sequence 1, 2, . . . , n, that is, σ^((n))(j)=n+1−j for all j. In the cycle notation, σ^((n))=(1 n)(2 n−1) . . . (⌊(n+1)/2⌋ ⌈(n+1)/2⌉) (where ⌊.⌋ is the “floor” and ⌈.⌉ is the “ceiling”). The exchange induced by this permutation is called the 2^(n)×2^(n) “swap exchange” and denoted as SWAP^((n)).

[0302] For example, the 8×8 exchanges 2101 in FIG. 21A, 2102 in FIG. 21B, and 2103 in FIG. 21C, and the 16×16 exchange 2104 in FIG. 21D, show the graph representations of SHUF⁽³⁾ (=X₍₃ ₂ ₁₎), (SHUF⁽³⁾)⁻¹ (=X₍₁ ₂ ₃₎), BANY⁽³⁾ (=X₍₁ ₃₎), and SWAP⁽⁴⁾ (=X₍₁ ₄₎₍₂ ₃₎), respectively. Note that SWAP⁽³⁾ (=X₍₁ ₃₎) happens to be identical to BANY⁽³⁾. Therefore, the 8×8 exchange 2103 in FIG. 21C also represents SWAP⁽³⁾.

[0303] The product of two exchanges, each induced by a permutation, is the exchange induced by the product of the two permutations. Thus, if σ and π are permutations, then X_(σ)X_(π)=X_(σπ). This is illustrated in FIG. 6E, where the product of the 16×16 exchanges X₍₂ ₄₎ (691) and X₍₄ ₃ ₂ ₁₎ (692) yields the 16×16 exchange X₍₁ ₄₎₍₂ ₃₎ (693). The product of the same two exchanges in reversed order, that is, with the exchange X₍₄ ₃ ₂ ₁₎ (692) now in front of the exchange X₍₂ ₄₎ (691), as shown in FIG. 6F, results in a different exchange, X₍₄ ₃₎₍₂ ₁₎ (694).
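The product rule for exchanges, and its dependence on the order of the factors, can be confirmed on the two 16×16 exchanges named above. The Python sketch below is illustrative only; the helper names are hypothetical and bit 1 is assumed to be the leftmost bit of the address.

```python
def from_cycles(n, *cycles):
    """Permutation on 1..n as a dict, built from disjoint cycles."""
    perm = {k: k for k in range(1, n + 1)}
    for cycle in cycles:
        for i, a in enumerate(cycle):
            perm[a] = cycle[(i + 1) % len(cycle)]
    return perm


def product(sigma, pi):
    """Left-to-right product: (sigma pi)(k) = pi(sigma(k))."""
    return {k: pi[sigma[k]] for k in sigma}


def apply_exchange(sigma, n, address):
    """X_sigma: bit j of the input (leftmost bit is bit 1) becomes bit sigma(j)."""
    bits = format(address, "0{}b".format(n))
    out = [""] * n
    for j in range(1, n + 1):
        out[sigma[j] - 1] = bits[j - 1]
    return int("".join(out), 2)


if __name__ == "__main__":
    s = from_cycles(4, (2, 4))
    p = from_cycles(4, (4, 3, 2, 1))
    # Passing through X_s and then X_p is the same as passing through X_(sp).
    print(all(apply_exchange(p, 4, apply_exchange(s, 4, a))
              == apply_exchange(product(s, p), 4, a) for a in range(16)))
    print(product(s, p) == from_cycles(4, (1, 4), (2, 3)))   # as in FIG. 6E
    print(product(p, s) == from_cycles(4, (4, 3), (2, 1)))   # as in FIG. 6F
```

The left-to-right convention for multiplying permutations is what makes the product of exchanges read in signal-flow order.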

[0304] 3. Bit-permuting network

[0305] Definition C3: “bit-permuting network”. A 2^(n)×2^(n) multi-stage interconnection network is called a “bit-permuting network” if every stage consists of 2^(n−1) 2×2 nodes and every exchange in the network is bit-permuting.

[0306] For example, the 16×16 11-stage network with eight 2×2 nodes in each stage and a shuffle exchange between every two consecutive stages is a bit-permuting network.

[0307] A 2^(n)×2^(n) k-stage bit-permuting network can be completely determined by specifying all the inducing permutations of the exchanges of the network. Thus a 2^(n)×2^(n) k-stage bit-permuting network is denoted as [σ₀:σ₁:σ₂: . . . :σ_(k−1):σ_(k)]_(n), where the permutation σ_(j), 1≦j<k, induces the exchange between the j^(th) and (j+1)^(th) stages, the permutation σ₀ induces the input exchange, and the permutation σ_(k) induces the output exchange. A colon in this notation symbolizes a stage of 2×2 nodes. When there is no ambiguity, the subscript n in the notation can be omitted.

[0308] For example, network 2200 shown in FIG. 22 (which is also the structure of FIG. 7) is denoted as [id:(4 3 2 1):(1 4 2 3):(3 4):id]₄. When the input exchange or the output exchange is induced by the permutation “id”, i.e., when the exchange is absent, it may be omitted in the notation. So [id:(4 3 2 1):(1 4 2 3):(3 4):id]₄ may be written simply as [:(4 3 2 1):(1 4 2 3):(3 4):]₄. Meanwhile, the network [:]₁ is a single 2×2 node without I/O exchanges.

[0309] The two bit-permuting networks [σ₀:σ₁: . . . :σ_(k−1):σ_(k)]_(n) and [σ_(k)⁻¹:σ_(k−1)⁻¹: . . . :σ₁⁻¹:σ₀⁻¹]_(n) are “mirror images” of each other.

[0310] 4. Banyan-type network

[0311] Definition C4: “banyan-type network”. A 2^(n)×2^(n) n-stage, routable, bit-permuting network is called a “banyan-type network”.

[0312] For instance, a special case of a banyan-type network called the 2^(n)×2^(n) “banyan network” is the 2^(n)×2^(n) n-stage network without I/O exchanges such that the sequential interstage exchanges are 2^(n)×2^(n) banyan exchanges of increasing ranks:

[:(n 1):(n 2): . . . :(n n−2):(n n−1):]_(n)

[0313] The 2^(n)×2^(n) “baseline network” is the 2^(n)×2^(n) n-stage network without I/O exchanges such that the sequential interstage exchanges are 2^(n)×2^(n) inverse shuffle exchanges of increasing ranks:

[:(1 2 . . . n−1 n):(2 3 . . . n−1 n): . . . :(n−2 n−1 n):(n−1 n):]_(n)

[0314] The 2^(n)×2^(n) “Omega network” or “shuffle-exchange network” is the 2^(n)×2^(n) n-stage network without I/O exchanges such that every interstage exchange is the shuffle exchange:

[:(n n−1 . . . 2 1):(n n−1 . . . 2 1): . . . :(n n−1 . . . 2 1):]

[0315] The mirror images of the banyan, baseline, and Omega networks are the “reverse banyan”, “reverse baseline”, and “reverse Omega” networks, respectively. Thus the interstage exchanges in the 2^(n)×2^(n) reverse banyan network are 2^(n)×2^(n) banyan exchanges of decreasing ranks; those in the reverse baseline network are 2^(n)×2^(n) shuffle exchanges of decreasing ranks; and those in the reverse Omega network are all 2^(n)×2^(n) inverse shuffle exchanges.

[0316] For example, the network 2300 of FIG. 23 illustrates a [:(3 2 1):(3 2 1):], which is an 8×8 shuffle-exchange network and belongs to the family of 8×8 banyan-type networks.

[0317] The following two points highlight the extra qualifications of a banyan-type network over those of a bit-permuting network:

[0318] (1) A 2^(n)×2^(n) banyan-type network must have exactly n stages, while a 2^(n)×2^(n) bit-permuting network can have an arbitrary number of stages.

[0319] (2) A banyan-type network must be routable, while a bit-permuting network may be non-routable, as illustrated by the following example.

EXAMPLE 2

[0320] Despite its appearance, the 16×16 4-stage network 2400 in FIG. 24, denoted as [:(3 4):(1 4):(4 3 2 1):]₄, is not routable. Every external input in it can access only half of the external outputs. In fact, the network 2400 is the overlay of two logically disjoint copies of the 8×8 4-stage network [:(2 3):(1 3):(3 2 1):]₃. Cells in the network 2500 in FIG. 25 constitute one copy of [:(2 3):(1 3):(3 2 1):]₃, and cells in the network 2600 in FIG. 26 constitute the other copy.
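The claim that every input of this network reaches only half of the outputs can be checked by propagating the set of reachable addresses stage by stage. The Python sketch below is illustrative only. It assumes that the two ports of a 2×2 cell are the two addresses differing only in the last bit and that bit 1 is the leftmost bit; the helper names, including `reachable_outputs`, are hypothetical.

```python
def from_cycles(n, *cycles):
    """Permutation on 1..n as a dict, built from disjoint cycles."""
    perm = {k: k for k in range(1, n + 1)}
    for cycle in cycles:
        for i, a in enumerate(cycle):
            perm[a] = cycle[(i + 1) % len(cycle)]
    return perm


def apply_exchange(sigma, n, address):
    """X_sigma: bit j of the input (leftmost bit is bit 1) becomes bit sigma(j)."""
    bits = format(address, "0{}b".format(n))
    out = [""] * n
    for j in range(1, n + 1):
        out[sigma[j] - 1] = bits[j - 1]
    return int("".join(out), 2)


def reachable_outputs(n, interstage, source):
    """Outputs reachable from `source` in a network without I/O exchanges."""
    reach = {source}
    for stage in range(len(interstage) + 1):
        # A 2x2 cell links the two addresses that differ only in the last bit.
        reach = {a & ~1 for a in reach} | {a | 1 for a in reach}
        if stage < len(interstage):
            reach = {apply_exchange(interstage[stage], n, a) for a in reach}
    return reach


if __name__ == "__main__":
    net_2400 = [from_cycles(4, (3, 4)), from_cycles(4, (1, 4)), from_cycles(4, (4, 3, 2, 1))]
    print({len(reachable_outputs(4, net_2400, s)) for s in range(16)})   # {8}
    routable = [from_cycles(4, (3, 4)), from_cycles(4, (1, 4)), from_cycles(4, (2, 4))]
    print({len(reachable_outputs(4, routable, s)) for s in range(16)})   # {16}
```

The second network differs only in its last interstage exchange, which is enough to let every input reach all sixteen outputs.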

[0321] Bit-permuting 2-stage interconnection

[0322] The coordinate interchange of a 2Stg(m, n) can be expressed as a bit-permuting exchange if both m and n are powers of 2. In particular, if m=2^(k−r) and n=2^(r), that is, for a 2-stage interconnection network composed of 2^(r) 2^(k−r)×2^(k−r) input nodes and 2^(k−r) 2^(r)×2^(r) output nodes, the coordinate interchange is the r^(th) power of SHUF^((k)). For example, as shown in FIG. 16, the interstage exchange 1603 of the network 1600, which is the X2 version of a 2-stage interconnection network with parameters m=2=2³⁻² and n=4=2², is X₍₁ ₂ ₃₎, wherein the inducing permutation is (1 2 3), which is the 2^(nd) power of the permutation (3 2 1) inducing SHUF⁽³⁾, i.e. (3 2 1)²=(3 2 1)(3 2 1)=(1 2 3).
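The identification of the coordinate interchange with a power of the shuffle exchange can be verified for small sizes. The Python sketch below is an illustration under the linear addressing scheme; the function names `shuffle`, `shuffle_power` and `coordinate_interchange` are hypothetical.

```python
def shuffle(address, k):
    """SHUF^(k): left-rotate a k-bit address by one bit."""
    top = address >> (k - 1)                        # the leftmost bit
    return ((address << 1) & ((1 << k) - 1)) | top


def shuffle_power(address, k, r):
    """Apply the shuffle exchange r times in succession."""
    for _ in range(r):
        address = shuffle(address, k)
    return address


def coordinate_interchange(m, n):
    """Interstage exchange of 2Stg(m, n): address m*y + x maps to n*x + y."""
    return [n * (a % m) + (a // m) for a in range(m * n)]


if __name__ == "__main__":
    k, r = 3, 2                                     # m = 2, n = 4, as in FIG. 16
    m, n = 2 ** (k - r), 2 ** r
    print(coordinate_interchange(m, n))
    print([shuffle_power(a, k, r) for a in range(2 ** k)])
```

Both lines print the same list, since moving the r node-address bits from the front of the address to the back is a left-rotation by r bits.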

[0323] Recall from section B4 that a generalized 2-stage interconnection network with parameters m and n is just a routable 2-stage network whose interstage exchange can be in any form as long as it connects each of the m output ports on each input node to a distinct one of the m output nodes and each of the n input ports on each output node to a distinct one of the n input nodes. Similar to the above, the interstage exchange of a generalized 2-stage interconnection network with parameters m and n can be expressed as a bit-permuting exchange if both m and n are powers of 2. When the interstage exchange of a generalized 2-stage interconnection network is a bit-permuting exchange, the network is called a “bit-permuting 2-stage interconnection network”. In particular, for a bit-permuting 2-stage interconnection network with parameters 2^(k−r) and 2^(r), the interstage exchange is induced by a permutation σ on integers from 1 to k such that

[0324] σ maps the numbers r+1, r+2, . . . , k into the set {1, 2, . . . , k−r}, or equivalently,

[0325] σ maps the numbers 1, 2, . . . , r into the set {k−r+1, k−r+2, . . . , k}.

[0326] Note that by recursive application of bit-permuting 2-stage interconnections, the resulting network is a banyan-type network.

[0327] 5. Trace and guide of a bit-permuting network

[0328] Many attributes of a bit-permuting network are more conveniently rendered in the “trace” and/or “guide”. These attributes include: (a) routability; (b) routing control; (c) network equivalence under intra-stage cell rearrangement; and (d) various conditional non-blocking properties of switch realization.

[0329] The 2^(n−1) cells at each stage of the multi-stage network [σ₀:σ₁:σ₂: . . . :σ_(k−1):σ_(k)]_(n) are linearly ordered. The address labels are integers from 0 to 2^(n−1)−1 or, equivalently, the (n−1)-bit numbers. On the cell at the address b₁b₂ . . . b_(n−1), the two inputs are at the n-bit addresses b₁b₂ . . . b_(n−1)0 and b₁b₂ . . . b_(n−1)1 and so are the two outputs.

[0330] Definition C5: “trace and guide”. For a k-stage 2^(n)×2^(n)bit-permuting network, the trace and the guide of the bit-permutingnetwork [σ₀:σ₁: . . . :σ_(k−1):σ_(k)]_(n) are both sequences of knumbers wherein each number is an integer from 1 to n via the followingformulas:

[0331] The “trace” is the sequence

σ₀ ⁻¹(n), (σ₀σ₁)⁻¹(n), . . . , (σ₀σ₁ . . . σ_(k−2))⁻¹(n), (σ₀σ₁ . . .σ_(k−1))⁻¹(n).

[0332] The “guide” is the sequence

(σ₁σ₂ . . . σ_(k))(n), (σ₂σ₃ . . . σ_(k))(n), . . . , (σ_(k−1)σ_(k))(n),σ_(k)(n).

[0333] In general, for 1≦j≦k, the j^(th) term of the trace is (σ₀σ₁ . . . σ_(j−1))⁻¹(n) and the j^(th) term of the guide is (σ_(j)σ_(j+1) . . . σ_(k))(n).

[0334] The two sequences are very closely related. For a bit-permuting network [σ₀:σ₁: . . . :σ_(k−1):σ_(k)]_(n), when the permutation σ₀σ₁σ₂ . . . σ_(k) is applied to the trace term by term, the guide results. Conversely, when the permutation (σ₀σ₁σ₂ . . . σ_(k))⁻¹ is applied to the guide term by term, the trace results.

[0335] Note that the reversed sequence of the trace of the network [σ₀:σ₁: . . . :σ_(k−1):σ_(k)]_(n) is the guide of the network [σ_(k)⁻¹:σ_(k−1)⁻¹: . . . :σ₁⁻¹:σ₀⁻¹]_(n), which is the mirror-image network.

EXAMPLE 3

[0336] Let the trace and the guide of the 16×16 banyan-type network [id:(3 4):(1 4):(2 4):id] be the sequences t₁, t₂, t₃, t₄ and g₁, g₂, g₃, g₄, respectively. Thus t₁=σ₀⁻¹(4)=4 since σ₀⁻¹=id⁻¹=id and every number is mapped to itself by id; t₂=(σ₀σ₁)⁻¹(4)=3 since (σ₀σ₁)⁻¹=(id(3 4))⁻¹=(3 4)⁻¹=(4 3) and 4 is permuted to 3 by (4 3); t₃=(σ₀σ₁σ₂)⁻¹(4)=1 since (σ₀σ₁σ₂)⁻¹=(id(3 4)(1 4))⁻¹=(3 1 4)⁻¹=(4 1 3) and 4 is permuted to 1 by (4 1 3); and t₄=(σ₀σ₁σ₂σ₃)⁻¹(4)=2 since (σ₀σ₁σ₂σ₃)⁻¹=(id(3 4)(1 4)(2 4))⁻¹=(3 1 2 4)⁻¹=(4 2 1 3) and 4 is permuted to 2 by (4 2 1 3). As a whole, the trace is the sequence 4, 3, 1, 2. Similarly, g₁=(σ₁σ₂σ₃σ₄)(4)=((3 4)(1 4)(2 4)id)(4)=(3 1 2 4)(4)=3; g₂=(σ₂σ₃σ₄)(4)=((1 4)(2 4)id)(4)=(1 2 4)(4)=1; g₃=(σ₃σ₄)(4)=((2 4)id)(4)=(2 4)(4)=2; and g₄=(σ₄)(4)=(id)(4)=4. As a whole, the guide is the sequence 3, 1, 2, 4. Alternatively, the guide can be calculated from the trace by applying the permutation σ₀σ₁σ₂σ₃σ₄ to the trace term by term. Here σ₀σ₁σ₂σ₃σ₄=id(3 4)(1 4)(2 4)id=(3 1 2 4). Thus g₁=(3 1 2 4)(t₁)=(3 1 2 4)(4)=3, g₂=(3 1 2 4)(t₂)=(3 1 2 4)(3)=1, g₃=(3 1 2 4)(t₃)=(3 1 2 4)(1)=2, and g₄=(3 1 2 4)(t₄)=(3 1 2 4)(2)=4. This agrees with the first calculation.
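The calculation of Example 3 can be reproduced by a short program. This is a minimal sketch under the conventions apparent from the example: a cycle (a b c) maps a to b, b to c, and c to a, and a product of permutations is applied from left to right, so that (3 4)(1 4)=(3 1 4). The function names `perm`, `compose`, `inverse`, and `trace_and_guide` are illustrative and not part of the disclosure.

```python
def perm(n, *cycles):
    """Permutation of 1..n as a dict, built from disjoint cycles (a b c): a->b->c->a."""
    p = {i: i for i in range(1, n + 1)}
    for cyc in cycles:
        for a, b in zip(cyc, cyc[1:] + cyc[:1]):
            p[a] = b
    return p


def compose(*perms):
    """Left-to-right product: the leftmost permutation is applied first."""
    out = {}
    for x in perms[0]:
        y = x
        for p in perms:
            y = p[y]
        out[x] = y
    return out


def inverse(p):
    return {v: k for k, v in p.items()}


def trace_and_guide(sigmas, n):
    """sigmas = [sigma_0, ..., sigma_k]; returns (trace, guide), each of length k."""
    k = len(sigmas) - 1
    trace = [inverse(compose(*sigmas[:j]))[n] for j in range(1, k + 1)]
    guide = [compose(*sigmas[j:])[n] for j in range(1, k + 1)]
    return trace, guide


n = 4
# The network [id:(3 4):(1 4):(2 4):id] of Example 3.
example3 = [perm(n), perm(n, (3, 4)), perm(n, (1, 4)), perm(n, (2, 4)), perm(n)]
trace, guide = trace_and_guide(example3, n)
print(trace, guide)

# Applying the full product to the trace term by term should give the guide.
total = compose(*example3)
print([total[t] for t in trace] == guide)
```

The same function applies unchanged to a network whose input or output exchange is not the identity, since the formulas of Definition C5 involve σ₀ and σ_(k) directly.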

[0337] Alternatively, a graphical manner for determining the trace and guide is now described with reference to line diagram 2700 in FIG. 27.

[0338] TRACE: The sequence of the original set of n=4 integers in this banyan-type network appears in the first row 2701 in order 1, 2, 3, 4 (thus n=4 appears automatically at the top of the last column). Second row 2702 is obtained by applying the cycle (3 4) to the integers in row 2701; the cycle (3 4) appears on the left-hand side between rows 2701 and 2702 for reference. Next, third row 2703 is produced by applying the cycle (1 4) to the integers of row 2702; the cycle (1 4) appears between rows 2702 and 2703 on the left-hand side for reference. Finally, fourth row 2704 is generated by applying the cycle (2 4) to the integers of row 2703; the cycle (2 4) appears between rows 2703 and 2704 on the left-hand side for reference.

[0339] To determine the trace:

[0340] (a) in the second row, locate the column where the integer n=4 appears, which is the third column, labeled 2713. From the top of column 2713, note the sequence of numbers in going from the top to the location of integer 4. In this case, the sequence is 3-to-4 or 3, 4. The path in this sequence is shown by dashed line 2721.

[0341] (b) in the third row, locate the column where the integer n=4 appears, which is the first column, labeled 2711. From the top of column 2711, note the sequence of numbers in going from the top to the location of integer 4. In this case, the sequence is 1-to-1-to-4 or 1, 1, 4. The path in this sequence is shown by dashed lines 2722 and 2723.

[0342] (c) in the fourth row, locate the column where the integer n=4 appears, which is the second column, labeled 2712. From the top of column 2712, note the sequence of numbers in going from the top to the location of integer 4. In this case, the sequence is 2-to-2-to-2-to-4 or 2, 2, 2, 4. The path in this sequence is shown by dashed lines 2724, 2725, and 2726.

[0343] (d) construct “triangle-like” diagram 2750 in the lower left-hand side of FIG. 27, as follows:

[0344] (i) first place the integer n=4 on the diagonal at four locations;

[0345] (ii) list the sequence from step (a) horizontally, that is, 3-to-4, on the second row 2751;

[0346] (iii) list the sequence from step (b) horizontally on third row 2752; and

[0347] (iv) list the sequence from step (c) horizontally on fourth row 2753; and

[0348] (e) trace 2754 is read as the sequence from top-to-bottom on the left-hand side of diagram 2750, namely, 4, 3, 1, 2.

[0349] GUIDE: The sequence of the original set of n=4 integers in this banyan-type network appears in the first row 2701 in order 1, 2, 3, 4. Second row 2702 is obtained by applying the cycle (3 4) to the integers in row 2701; the cycle (3 4) appears on the left-hand side between rows 2701 and 2702 for reference. Next, third row 2703 is produced by applying the cycle (1 4) to the integers of row 2702; the cycle (1 4) appears between rows 2702 and 2703 on the left-hand side for reference. Finally, fourth row 2704 is generated by applying the cycle (2 4) to the integers of row 2703; the cycle (2 4) appears between rows 2703 and 2704 on the left-hand side for reference.

[0350] To determine the guide:

[0351] (a) in the first row, locate the column where the integer n=4 appears, which is the fourth column, labeled 2714. From the place of appearance of n=4, note the sequence of numbers in going from n=4 to the bottom of the column. In this case, the sequence is 4-to-3-to-3-to-3 or 4, 3, 3, 3. The path in this sequence is shown by dashed lines 2731, 2732, and 2733.

[0352] (b) in the second row, locate the column where the integer n=4 appears, which is the third column, labeled 2713. From the location of n=4 in column 2713, note the sequence of numbers in going from n=4 to the bottom of the column. In this case, the sequence is 4-to-1-to-1 or 4, 1, 1. The path in this sequence is shown by dashed lines 2734 and 2735.

[0353] (c) in the third row, locate the column where the integer n=4 appears, which is the first column, labeled 2711. From the location of n=4 in column 2711, note the sequence of numbers in going from n=4 to the bottom of the column. In this case, the sequence is 4-to-2 or 4, 2. The path in this sequence is shown by dashed line 2736.

[0354] (d) construct “triangle-like” diagram 2760 in the lower right-hand side of FIG. 27, as follows:

[0355] (i) first place the integer n=4 on the diagonal at four locations;

[0356] (ii) list the sequence from step (a) horizontally, that is, 4-to-3-to-3-to-3, on the first row 2761;

[0357] (iii) list the sequence from step (b) horizontally on second row 2762; and

[0358] (iv) list the sequence from step (c) horizontally on third row 2763; and

[0359] (e) guide 2764 is read as the sequence from top-to-bottom on the right-hand side of diagram 2760, namely, 3, 1, 2, 4.
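The graphical procedure amounts to building the rows of diagram 2700 and reading, for each row, the top and the bottom of the column in which n appears. The sketch below is illustrative only and is restricted, as in FIG. 27, to the case where the input and output exchanges are the identity and every interstage permutation is a 2-cycle; the function names are assumptions.

```python
def build_rows(n, transpositions):
    """Row 1 is 1..n; each later row applies one 2-cycle (a b) to the row above."""
    rows = [list(range(1, n + 1))]
    for a, b in transpositions:
        swap = {a: b, b: a}
        rows.append([swap.get(value, value) for value in rows[-1]])
    return rows


def read_trace_and_guide(rows, n):
    """For each row, find the column holding n; the trace term is the top of
    that column and the guide term is its bottom."""
    trace, guide = [], []
    for row in rows:
        col = row.index(n)
        trace.append(rows[0][col])
        guide.append(rows[-1][col])
    return trace, guide


# Rows 2701-2704 for the cycles (3 4), (1 4), (2 4).
rows = build_rows(4, [(3, 4), (1, 4), (2, 4)])
for row in rows:
    print(row)
print(read_trace_and_guide(rows, 4))
```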

EXAMPLE 4

[0360] The 16×16 banyan network preceded by the shuffle exchange is [(4 3 2 1):(1 4):(2 4):(3 4):id]. Both the trace and the guide are the monotonic sequence 1, 2, 3, 4, as calculated in FIGS. 28A and 28B, respectively.

[0361] 6. Trace and guide of a network constructed by recursive 2-stage construction from cells

[0362] Recall the definitions in Section B of recursive plain 2-stage, 2X, and X2 constructions from cells. Such constructed networks are all banyan-type networks. In fact, every recursive 2-stage interconnection network of cells is a banyan-type network with monotonically decreasing trace and monotonically increasing guide, every recursive 2X interconnection network of cells is a banyan-type network with monotonically decreasing trace and guide, and every recursive X2 interconnection network of cells is a banyan-type network with monotonically increasing trace and guide.

EXAMPLE 5

[0363] Recall FIG. 19 in section B. The 8×8 banyan-type network 1630 is a recursive X2 interconnection network of cells. The network is expressed as [(3 2 1):(3 1):(3 2):]. The trace is calculated to be the sequence 1, 2, 3, and the guide is also the sequence 1, 2, 3. Both sequences are monotonically increasing.

[0364] 7. Interpretation of trace and guide

[0365] To elucidate the import of the trace and guide, it is instructive to highlight an example of how the stage-by-stage I/O addresses along a generic route through a 16×16 banyan-type network are obtained.

EXAMPLE 6

[0366] FIG. 29 illustrates a route, shown by the “dark-line path”, through the 16×16 banyan-type network 2900 [id:(3 4):(1 4):(2 4):(4 3 2 1)]₄ from the origination address binary (I₁I₂I₃I₄)=1100 to the destination address binary (O₁O₂O₃O₄)=1110. Along this route the stage-by-stage I/O address progresses as follows in Table 1: TABLE 1

[0367] It is noted that the last bit position in the input bits, listed from top-to-bottom, is the sequence of bits I₄, I₃, I₁, and I₂. The subscripts of these bit positions, read in sequence, are 4, 3, 1, 2, which is the trace. Similarly, the last bit position in the output bits, listed from top-to-bottom, is O₂, O₄, O₁, and O₃. The subscripts of these bit positions, read in sequence, are 2, 4, 1, 3, which is the guide. All bits in the stage-j output address are the same as in the stage-j input address except that the rightmost bit is prescribed by the switching decision of the stage-j cell. For the illustrated network, bits I₄, I₃, I₁, and I₂ of the origination address are rotated to the rightmost bit position upon entering cells at the successive stages and are replaced successively by bits O₂, O₄, O₁, and O₃ of the destination address. Again, the subscripts of the input and output sequences of bits are stipulated by the trace and the guide of the network, respectively.

[0368] Note that both the trace and the guide include all numbers from 1 to 4. Thus the sequential bit replacements involve all bits in the origination and destination addresses. This fact reflects the network's routability.

EXAMPLE 7

[0369] Consider the 16×16 non-routable network 2400 [id:(3 4):(1 4):(4 3 2 1):id]₄ already illustrated in FIG. 24. By the calculation summarized in FIG. 30A, the trace of this network is the sequence 4, 3, 1, 3. Similarly, the guide is the sequence 2, 4, 3, 4 by the calculation summarized in FIG. 30B. Consider Table 2 below, which is determined in the same manner as Table 1: TABLE 2

[0370] Another way to view the stage-by-stage progression of the I/O addresses along the route as conveyed by Table 2 is diagram 3100 of FIG. 31. As depicted, the permutation and replacement of the input bits I₁I₂I₃I₄ in the top row are shown in a top-down manner as the bits progress through network 2400 of FIG. 24. The last row shows quite explicitly the fact that there exists a route from an origination address binary (I₁I₂I₃I₄) to a destination address binary (O₁O₂O₃O₄) if and only if I₂=O₁. This undesirable situation occurs because the number 2 does not appear in the trace, nor does the number 1 appear in the guide. Hence the bit I₂ is never rotated to the rightmost bit position and so is never replaced. Eventually it is rotated to the leftmost bit position. Close scrutiny of the sequential bit substitution finds bit I₃ rotated to the rightmost bit position upon entering stage 2 and replaced by a random bit (say Y) at stage 2, while the new bit Y is later rotated to the rightmost bit position upon entering stage 4 and is overwritten. This fact is reflected in the repeated appearance of the number 3 at both the second and the fourth terms in the trace.

[0371] In general, the generic term (σ₀σ₁σ₂ . . . σ_(j−1))⁻¹(n) in the trace and the generic term (σ_(j)σ_(j+1) . . . σ_(k))(n) in the guide can be interpreted as follows. The bit at position (σ₀σ₁σ₂ . . . σ_(j−1))⁻¹(n) in the origination address is relocated to the rightmost bit position through successive exchanges induced by σ₀, σ₁, σ₂, . . . , σ_(j−1). The bit is then replaced by a new bit reflecting the switching decision at stage j. This new bit is eventually rotated to the bit position (σ_(j)σ_(j+1) . . . σ_(k))(n) of the final destination through successive exchanges induced by σ_(j), σ_(j+1), . . . , σ_(k).
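This interpretation can be followed symbolically. The sketch below assumes, as one reading of the paragraph above, that the exchange induced by σ moves the bit at position p to position σ(p), and that each stage overwrites the rightmost position (position n) with its switching decision. The symbols S1, S2, . . . for the stage decisions and the function names are hypothetical.

```python
def perm(n, *cycles):
    """Permutation of 1..n as a dict, from disjoint cycles (a b c): a->b->c->a."""
    p = {i: i for i in range(1, n + 1)}
    for cyc in cycles:
        for a, b in zip(cyc, cyc[1:] + cyc[:1]):
            p[a] = b
    return p


def exchange(bits, sigma):
    """Assumed model of the exchange: the bit at position p moves to sigma(p)."""
    moved = [None] * len(bits)
    for pos, bit in enumerate(bits, start=1):
        moved[sigma[pos] - 1] = bit
    return moved


def final_address(sigmas, n):
    """Symbolic destination address for the network [sigma_0: ... :sigma_k]_n."""
    bits = exchange([f"I{p}" for p in range(1, n + 1)], sigmas[0])
    for stage in range(1, len(sigmas)):
        bits[n - 1] = f"S{stage}"          # stage decision replaces the rightmost bit
        bits = exchange(bits, sigmas[stage])
    return bits


n = 4
network_2900 = [perm(n), perm(n, (3, 4)), perm(n, (1, 4)), perm(n, (2, 4)), perm(n, (4, 3, 2, 1))]
network_2400 = [perm(n), perm(n, (3, 4)), perm(n, (1, 4)), perm(n, (4, 3, 2, 1)), perm(n)]

print(final_address(network_2900, n))  # every position holds a stage decision
print(final_address(network_2400, n))  # the leftmost position still holds I2
```

Under this model, a position of the final address that still carries an origination bit marks a bit that no stage can set, which is the situation of network 2400 where I₂=O₁ is forced.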

[0372] Now suppose that a certain number p appears in the trace exactly three times, say, p=(σ₀σ₁σ₂ . . . σ_(i−1))⁻¹(n)=(σ₀σ₁σ₂ . . . σ_(j−1))⁻¹(n)=(σ₀σ₁σ₂ . . . σ_(m−1))⁻¹(n), where 1≦i<j<m≦k, and all other numbers are present at least once in the trace. Then the bit at position (σ₀σ₁σ₂ . . . σ_(i−1))⁻¹(n) in the origination address is rotated to the rightmost bit position and is replaced by a new bit of the switching decision of stage i. This new bit is rotated to the rightmost bit position and is overwritten by the switching decision at stage j. This switching decision in turn is overwritten at stage m. Finally, the bit of the switching decision at stage m is rotated to the bit position (σ_(m)σ_(m+1) . . . σ_(k))(n) of the final destination. In this scenario, switching at stages i and j is redundant. In some multi-stage switching designs, redundant stages are present for the purpose of alternate routing.

[0373] 8. Routability of a bit-permuting network

[0374] For k≧n, if either the trace or the guide of the network [σ₀:σ₁:σ₂: . . . :σ_(k−1):σ_(k)]_(n) includes all numbers from 1 to n, so does the other because of the close relationship between the two sequences. In this case, all bits in the origination address are replaced by switching decisions throughout the stages. Thus every bit in the destination address reflects the switching decision of some stage, which means that the network is routable. In other words, for any 2^(n)×2^(n) bit-permuting network, the routability of the network can easily be tested by examining either the trace or the guide of the network. If either sequence contains all numbers from 1 to n, then so does the other and the network is routable; otherwise, the network is just the superimposition of a plurality of logically disjoint copies of a smaller network. An example of a non-routable bit-permuting network is the network 2400 in FIG. 24.

[0375] In particular, for any 2^(n)×2^(n) banyan-type network, the following are equivalent:

[0376] The network is routable.

[0377] The trace is a sequence of n distinct integers from 1 to n.

[0378] The guide is a sequence of n distinct integers from 1 to n.
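The routability test described above reduces to checking whether the trace covers every integer from 1 to n. The following is a minimal sketch under the conventions of Example 3 (cycles map each entry to the next; products are applied left to right); the function names `perm`, `trace`, and `is_routable` are illustrative.

```python
def perm(n, *cycles):
    """Permutation of 1..n as a dict, from disjoint cycles (a b c): a->b->c->a."""
    p = {i: i for i in range(1, n + 1)}
    for cyc in cycles:
        for a, b in zip(cyc, cyc[1:] + cyc[:1]):
            p[a] = b
    return p


def trace(sigmas, n):
    """j-th term: the x with (sigma_0 ... sigma_{j-1})(x) = n, for j = 1..k."""
    running = {x: x for x in range(1, n + 1)}
    terms = []
    for sigma in sigmas[:-1]:                 # sigma_0 .. sigma_{k-1}
        running = {x: sigma[running[x]] for x in running}
        terms.append(next(x for x, y in running.items() if y == n))
    return terms


def is_routable(sigmas, n):
    """Routable exactly when the trace contains every number from 1 to n."""
    return set(trace(sigmas, n)) == set(range(1, n + 1))


n = 4
network_2900 = [perm(n), perm(n, (3, 4)), perm(n, (1, 4)), perm(n, (2, 4)), perm(n, (4, 3, 2, 1))]
network_2400 = [perm(n), perm(n, (3, 4)), perm(n, (1, 4)), perm(n, (4, 3, 2, 1)), perm(n)]

print(trace(network_2900, n), is_routable(network_2900, n))
print(trace(network_2400, n), is_routable(network_2400, n))
```

The cost is one pass over the k+1 permutations, in contrast to enumerating routes through the 2^(n)×2^(n) network.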

[0379] The design of a routable k-stage 2^(n)×2^(n) bit-permuting network involves the selection of a particular sequence of k+1 permutations inducing the input exchange, the k−1 interstage exchanges, and the output exchange. When routability is the only design concern, the choice of the permutation for each exchange is arbitrary as long as the resulting network is routable. When n and k are large, the number of possible permutations for each exchange grows rapidly and hence so does the number of combinations of the k+1 permutations. Testing the routability by brute force would be difficult. The disclosed method for testing the routability of a bit-permuting network provides a simple, instant, and systematic solution, owing to the simple calculation of the trace and guide, which are convenient and powerful analytical tools for bit-permuting networks.

[0380] 9. Altering the trace of a banyan-type network by prepending an input exchange and altering the guide by appending an output exchange

[0381] For a sequence a₁, a₂, . . . , a_(n) of n distinct integers from 1 to n, there always exists a unique permutation σ such that σ(j)=a_(j) for all j. For example, if the sequence is 4, 1, 2, 3, then since σ(1)=4, σ(2)=1, σ(3)=2 and σ(4)=3, σ is completely determined to be the permutation (1 4 3 2). Recall that the trace and the guide of a 2^(n)×2^(n) banyan-type network [σ₀:σ₁: . . . :σ_(n−1):σ_(n)] are sequences of n distinct integers from 1 to n. Thus there exist permutations τ and γ such that the trace is the sequence τ(1), τ(2), . . . , τ(n) and the guide is the sequence γ(1), γ(2), . . . , γ(n). The permutation τ is then said to “induce” the trace of the network, and the permutation γ is said to “induce” the guide.
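The passage from a sequence to its inducing permutation can be sketched as follows. The function name `inducing_permutation` is illustrative; it returns both the mapping σ(j)=a_(j) and its cycle notation, omitting fixed points as the text does.

```python
def inducing_permutation(sequence):
    """Return (sigma, cycles) with sigma[j] = a_j for the sequence a_1, ..., a_n."""
    n = len(sequence)
    if sorted(sequence) != list(range(1, n + 1)):
        raise ValueError("sequence must consist of n distinct integers from 1 to n")
    sigma = {j: a for j, a in enumerate(sequence, start=1)}

    cycles, seen = [], set()
    for start in range(1, n + 1):
        if start in seen or sigma[start] == start:
            continue                      # already listed, or a fixed point
        cycle, x = [], start
        while x not in seen:
            seen.add(x)
            cycle.append(x)
            x = sigma[x]
        cycles.append(tuple(cycle))
    return sigma, cycles


print(inducing_permutation([4, 1, 2, 3])[1])
```

A sequence with a repeated term, such as the trace 4, 3, 1, 3 of a non-routable network, is rejected because no permutation induces it.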

EXAMPLE 8

[0382] A 2^(n)×2^(n) banyan-type network whose trace and guide are both the monotonically increasing sequence 1, 2, . . . , n has both the trace and guide induced by id. On the other hand, a 2^(n)×2^(n) banyan-type network whose trace and guide are both the monotonically decreasing sequence n, n−1, . . . , 1 has both the trace and guide induced by σ^((n)), where σ^((n))=(1 n)(2 n−1) . . . (⌊n/2⌋ ⌈n/2⌉+1).

EXAMPLE 9

[0383] The 16×16 banyan-type network 2900 as shown in FIG. 29 is [id:(3 4):(1 4):(2 4):(4 3 2 1)]₄. Its trace is the sequence 4, 3, 1, 2 and its guide is the sequence 2, 4, 1, 3. Thus the trace is induced by τ=(1 4 2 3) and the guide by γ=(1 2 4 3).

[0384] When a network [σ₀:σ₁: . . . :σ_(n−1):σ_(n)] with trace induced by τ and guide by γ is prepended with an additional input exchange X_(λ) and appended with an additional output exchange X_(π), the resulting network [λσ₀:σ₁: . . . :σ_(n−1):σ_(n)π] will have the trace induced by τ′ and the guide by γ′ where

τ′(1)=λ⁻¹(τ(1)), τ′(2)=λ⁻¹(τ(2)), . . . , τ′(n)=λ⁻¹(τ(n)) and γ′(1)=π(γ(1)), γ′(2)=π(γ(2)), . . . , γ′(n)=π(γ(n)).

[0385] By comparing the expressions on the two sides of the equality signs, it is readily seen that τ′=τλ⁻¹ and γ′=γπ. Conversely, if τ and τ′ are given, λ can be computed as λ=τ′⁻¹τ. Similarly, π can be calculated from γ and γ′ as π=γ⁻¹γ′. A direct consequence is that the permutations τ and γ that induce the trace and the guide of a banyan-type network can be changed to any τ′ and γ′, respectively, by simply prepending the network with an input exchange X_(λ) and appending an output exchange X_(π), where λ=τ′⁻¹τ and π=γ⁻¹γ′. In other words, the trace τ(1), τ(2), . . . , τ(n) of any 2^(n)×2^(n) banyan-type network [σ₀:σ₁: . . . :σ_(n−1):σ_(n)] can be changed to another sequence τ′(1), τ′(2), . . . , τ′(n) by prepending the network with an input exchange X_(λ) where λ=τ′⁻¹τ; and the guide γ(1), γ(2), . . . , γ(n) of any 2^(n)×2^(n) banyan-type network [σ₀:σ₁: . . . :σ_(n−1):σ_(n)] can be changed to another sequence γ′(1), γ′(2), . . . , γ′(n) by appending the network with an output exchange X_(π) where π=γ⁻¹γ′.

EXAMPLE 10

[0386] For the 8×8 banyan-type network [(2 3):(2 3):(1 3):id]₃, the trace is induced by τ=(1 2 3) and the guide by γ=(1 2). Meanwhile, an 8×8 network with monotonically decreasing trace and guide has the trace induced by τ′=(1 3) and the guide by γ′=(1 3). In order to turn the 8×8 banyan-type network into one with monotonically decreasing trace and guide, the required λ can be calculated as τ′⁻¹τ=(1 3)⁻¹(1 2 3)=(3 1)(1 2 3)=(3 2), and the required π=γ⁻¹γ′=(1 2)⁻¹(1 3)=(2 1)(1 3)=(1 2 3).
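The calculation of Example 10 can be verified end to end by forming the altered network [λσ₀:σ₁:σ₂:σ₃π] and recomputing its trace and guide. This is a sketch under the same conventions as the text (cycles map each entry to the next; products are applied left to right); all function and variable names are illustrative.

```python
def perm(n, *cycles):
    """Permutation of 1..n as a dict, from disjoint cycles (a b c): a->b->c->a."""
    p = {i: i for i in range(1, n + 1)}
    for cyc in cycles:
        for a, b in zip(cyc, cyc[1:] + cyc[:1]):
            p[a] = b
    return p


def compose(*perms):
    """Left-to-right product: the leftmost permutation is applied first."""
    out = {}
    for x in perms[0]:
        y = x
        for p in perms:
            y = p[y]
        out[x] = y
    return out


def inverse(p):
    return {v: k for k, v in p.items()}


def trace_and_guide(sigmas, n):
    k = len(sigmas) - 1
    trace = [inverse(compose(*sigmas[:j]))[n] for j in range(1, k + 1)]
    guide = [compose(*sigmas[j:])[n] for j in range(1, k + 1)]
    return trace, guide


n = 3
network = [perm(n, (2, 3)), perm(n, (2, 3)), perm(n, (1, 3)), perm(n)]
tau, gamma = perm(n, (1, 2, 3)), perm(n, (1, 2))
target = perm(n, (1, 3))                  # induces the decreasing sequence 3, 2, 1

lam = compose(inverse(target), tau)       # lambda = tau'^-1 tau
pi = compose(inverse(gamma), target)      # pi = gamma^-1 gamma'

# Prepend X_lambda and append X_pi: [lambda sigma_0 : sigma_1 : sigma_2 : sigma_3 pi].
altered = [compose(lam, network[0])] + network[1:-1] + [compose(network[-1], pi)]

print(trace_and_guide(network, n))
print(trace_and_guide(altered, n))
```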

[0387] Note that for a general bit-permuting network [σ₀:σ₁: . . . :σ_(k−1):σ_(k)]_(n), whenever the trace is not a sequence of n distinct integers from 1 to n, and hence neither is the guide, they cannot be written as τ(1), τ(2), . . . , τ(n), and γ(1), γ(2), . . . , γ(n), that is, they are not associated with any pair of permutations τ and γ. However, the trace and the guide of the network will still be altered when the network is prepended with an additional input exchange and appended with an additional output exchange. Let the trace and the guide of a generic bit-permuting network [σ₀:σ₁: . . . :σ_(k−1):σ_(k)]_(n) be t₁, t₂, . . . , t_(k) and g₁, g₂, . . . , g_(k), respectively. Then by prepending an input exchange X_(λ) and appending an additional output exchange X_(π), the resulting network [λσ₀:σ₁: . . . :σ_(k−1):σ_(k)π]_(n) will have the new trace t′₁, t′₂, . . . , t′_(k) and the new guide g′₁, g′₂, . . . , g′_(k) where t′_(j)=λ⁻¹(t_(j)) and g′_(j)=π(g_(j)), for each j.

[0388] In contrast to the situation for banyan-type networks, the trace and the guide of a bit-permuting network in general cannot be arbitrarily altered by prepending an input exchange and appending an output exchange. For example, a trace 1, 2, 3, 1 can never be changed to another trace 1, 2, 3, 2 in this way. On the other hand, if the trace and the guide of a bit-permuting network can be changed to the trace and the guide of another bit-permuting network by prepending an input exchange and/or appending an output exchange, the two networks are regarded as equivalent. In particular, all banyan-type networks are equivalent in this sense, the weakest sense of equivalence. Different senses of equivalence among bit-permuting networks and among banyan-type networks will be discussed in section Q after the introduction of “cell rearrangement.”
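Whether one trace can be turned into another by prepending an input exchange comes down to whether a single one-to-one relabeling carries each term t_(j) to t′_(j). The sketch below is illustrative; the function name `relabeling` is an assumption, and the returned mapping plays the role of λ⁻¹ restricted to the numbers that occur in the trace.

```python
def relabeling(old_trace, new_trace):
    """Return a dict m with m[t_j] = t'_j if one consistent, one-to-one
    relabeling exists; otherwise return None."""
    if len(old_trace) != len(new_trace):
        return None
    forward, backward = {}, {}
    for t, t_new in zip(old_trace, new_trace):
        # setdefault records the first pairing and exposes any later conflict.
        if forward.setdefault(t, t_new) != t_new:
            return None                   # one number would need two images
        if backward.setdefault(t_new, t) != t:
            return None                   # two numbers would share one image
    return forward


print(relabeling([1, 2, 3, 1], [1, 2, 3, 2]))
print(relabeling([1, 2, 3, 1], [2, 3, 1, 2]))
```

A partial one-to-one mapping on a subset of {1, . . . , n} can always be extended to a permutation of the whole set, so a non-None result means a suitable λ exists.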

[0389] It should be noted that prepending an input exchange and appending an output exchange can be regarded as altering the original input exchange and output exchange, respectively. Recall that the I/O exchanges are due to external I/O orderings that differ from the default system; therefore, the alteration of the I/O exchanges of a network can be realized either by physically prepending or appending a wiring of the exchange pattern or by virtually re-labeling the external I/O addresses.

[0390] D. CONDITIONALLY NONBLOCKING SWITCHES

[0391] The definition of a “nonblocking switch” in Section A.1 can be paraphrased as follows: An m×n switch is said to be “nonblocking” if, for every sequence of distinct inputs I₀, I₁, . . . , I_(k−1), and every sequence of distinct outputs O₀, O₁, . . . , O_(k−1), where k=min {m, n}, there exists a connection state that concurrently connects each I_(j) to O_(j) for all j. This section deals with “conditionally nonblocking” switches, which are substitutes for nonblocking switches when the input traffic has been preprocessed so as to meet certain “conditions”. A compressor, a decompressor, an expander, a UC nonblocking switch, etc., to be defined in the sequel, are conditionally nonblocking switches, where the “conditions” pertain to the correlation between active input addresses and active output addresses.

[0392] 1. Compressor and decompressor

[0393] Recall from Definition A7 that a switch is said to accommodate a combination of concurrent I/O connections if there exists a connection state of the switch that achieves every I/O connection in the combination. When a combination of concurrent connections is accommodated by a switch, the I/O connections in the qualified connection state cover, but are not limited to, the combination that is being accommodated.

[0394] Definition D1: “compressor” and “decompressor”. An N×N switch is called a “compressor switch” (resp. “decompressor switch”), or simply a “compressor” (resp. “decompressor”), if it can accommodate every combination of k concurrent connections, k≦N, from k distinct inputs, which are referred to as the k “active inputs” and their addresses the “active input addresses”, to k distinct outputs, which are referred to as the k “active outputs” and their addresses the “active output addresses”, subject to: there exists a rotation on the ordering of the N output (resp. input) addresses such that the following constraints are met:

[0395] (a) the k active output (resp. input) addresses are consecutive after the rotation; and

[0396] (b) the correspondence between active I/O addresses is order preserving after the rotation.

[0397] The two constraints, which are correlations among the active I/O addresses, are collectively referred to as the “compressor constraint” (resp. “decompressor constraint”).

[0398] In other words, upon a connection request of routing k incoming signals, k≦N, wherein the k incoming signals arrive at k distinct input ports determining the k active input addresses and are destined for k distinct corresponding output ports determining the k active output addresses, the compressor (resp. decompressor) can always accommodate the connection request by activating an appropriate one of its connection states as long as the connection request is compliant with the compressor constraint (resp. decompressor constraint).

[0399] The k concurrent connections in the combination are from distinct inputs and hence all are point-to-point connections, but the connection state to accommodate the combination is not necessarily point-to-point.

[0400] The phrase “order preserving” employed by the definition to describe the correspondence between active I/O addresses means that when the active addresses on one side (e.g. input side) are arranged according to an ordering of the addresses, e.g. in increasing order, then the ordering of the corresponding active addresses on the other side is also the same, e.g. also increasing. This preservation of the orderings through the I/O correspondence may be subject to a rotation on the ordering of the addresses on one side.

EXAMPLE 1

[0401] An exemplary connection request compliant with the compressor constraint is shown in FIG. 32A. Consider the 5×5 switch 3200 in FIG. 32A. The five input ports (3201, 3202, 3203, 3204, and 3205) and five output ports (3206, 3207, 3208, 3209, and 3210) are respectively labeled from top to bottom with the addresses 0, 1, 2, 3, and 4 before any rotation, and the requested connections are “1→3” (meaning “a connection from input 1 to output 3”), “3→4” and “4→0”, indicated by the arrows 3211, 3212 and 3213, respectively. The combination of these three connections is compliant with the compressor constraint because, when the ordering of the output addresses is rotated in such a way that the five output ports are labeled from top to bottom as 2, 3, 4, 0, 1, for instance, as shown in FIG. 32B, then after this rotation, (1) the new addresses of the three active output ports become 0, 1, and 2, so they are consecutive; (2) the active connection pairs now become “1→0”, “3→1” and “4→2”, as indicated by the arrows 3221, 3222 and 3223, respectively, and hence the correspondence between active I/O addresses is clearly order preserving.
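The check performed in this example can be sketched in code. The sketch takes one reading of the compressor constraint: a rotation by s relabels output address a as (a − s) mod N, and the request is compliant when, for some s, the relabeled active outputs taken in increasing order of input address run through consecutive increasing integers. The function name `valid_rotations` and the representation of a request as (input, output) pairs are assumptions.

```python
def valid_rotations(connections, n):
    """Return every rotation amount s in 0..n-1 under which the point-to-point
    request satisfies constraints (a) and (b) of the compressor definition."""
    pairs = sorted(connections)                       # increasing input address
    inputs = [i for i, _ in pairs]
    outputs = [o for _, o in pairs]
    if len(set(inputs)) != len(inputs) or len(set(outputs)) != len(outputs):
        raise ValueError("active inputs and active outputs must be distinct")

    found = []
    for s in range(n):
        rotated = [(o - s) % n for o in outputs]
        # Steps of exactly +1 give both consecutiveness and order preservation.
        if all(b - a == 1 for a, b in zip(rotated, rotated[1:])):
            found.append(s)
    return found


# The request of FIG. 32A on a 5x5 switch: 1->3, 3->4, 4->0.
request = [(1, 3), (3, 4), (4, 0)]
print(valid_rotations(request, 5))

# An order-reversing request, for contrast: 1->4, 3->3.
print(valid_rotations([(1, 4), (3, 3)], 5))
```

The rotation shown in FIG. 32B, under which outputs 3, 4, 0 become 0, 1, 2, corresponds to s=3 in this parameterization; other rotations may qualify as well, since the definition only requires that one exists.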

[0402] A compressor/decompressor is a “conditionally nonblocking switch” since it only accommodates certain combinations of concurrent point-to-point connections, while a nonblocking switch accommodates every such combination. Note that the condition (a) is equivalent to the following: when the array of the output (resp. input) ports of the switch is imagined to be bent into a circular ring, the active output (resp. input) ports become consecutive along the ring. The equivalence of condition (b) is illustrated in the following example.

EXAMPLE 2

[0403] FIG. 32C shows five concurrent connections over a compressor. When rectangle 3220 representing the compressor is bent into cylinder 3230, as in FIG. 32D, by abutting (or gluing) the top edge of rectangle 3220 to the bottom edge, lines representing the five connections can be drawn in a nonintersecting manner because of the constraint (b) above in the compressor definition. The mirror images of FIGS. 32C and 32D show the case for a decompressor.

EXAMPLE 3

[0404] A 3×3 switch qualifies as a compressor if and only if it accommodates at least the six combinations of concurrent connections depicted by element 3300 in FIG. 33. Connection states to accommodate these six combinations can be ({0},{1},{2}), ({1},{2},{0}), ({2},{0},{1}), ({1},null,{2}), ({0},null,{1}), ({2},null,{0}). An alternative selection of the connection states is ({0},{1},{2}), ({1},{2},{0}), ({2},{0},{1}), ({1},{0},{2}), ({0},{2},{1}), ({2},{1},{0}).

EXAMPLE 4

[0405] A 2×2 switch qualifies as a compressor or decompressor if and only if it includes both the bar and cross states. Thus the switching cell is both a compressor and decompressor (see FIGS. 2A and 2B). In fact, the switching cell is a nonblocking switch unconditionally.

[0406] The similarity between the compressor and the decompressor can be seen from their respective definitions, which interchange the words “input” and “output” in the condition (a). Therefore, the mirror image of a compressor is a decompressor, and vice versa.

[0407] 2. Expander

[0408] Definition D2: “expander”. An N×N switch is called an “expander switch”, or simply “expander”, if it can accommodate every combination of k concurrent connections, k≦N, from k inputs to k distinct outputs subject to: there exists a rotation on the ordering of the N input addresses such that the following constraints are met:

[0409] (a) the k active input addresses are consecutive after the rotation; and

[0410] (b) let input addresses i and j be connected to output addresses p and q, respectively; if i precedes j with respect to the rotated ordering, then p<q.

[0411] The constraint (b) makes the active output addresses a “multi-valued order-preserving function” with respect to the rotated input addresses. The two constraints are collectively referred to as the “expander constraint”.

[0412] The concurrent connections in the above definition can be either point-to-point or multicast, because they are not necessarily from distinct inputs. An expander and a decompressor are similar except that a decompressor need only accommodate combinations of point-to-point connections.

EXAMPLE 5

[0413] The multicast connections in element 3400 of FIG. 34 from five input ports to nine output ports can be concurrently accommodated by an expander since the combination of these connections is compliant with the expander constraint. As in FIG. 32D, the lines representing the connections can be drawn in a nonintersecting manner when the rectangle of FIG. 34 is bent into a cylinder.

EXAMPLE 6

[0414] A 2×2 switch from the input array {0,1} to the output array {0,1} qualifies as an expander if and only if it includes at least the four connection states ({0},{1}), ({1},{0}), ({0,1}, null), and (null, {0,1}) depicted in FIGS. 2C-2F. The 2×2 switch comprising exactly these four connection states is called the “expander cell” in Definition A6.

[0415] 3. Upturned versions of compressor, decompressor and expander

[0416] Definition D3: “upturned compressor”, “upturned decompressor”, “upturned expander”. An “upturned compressor” (resp. “upturned decompressor”) is the same as a compressor (resp. decompressor) except that it is modified by “order reversing” instead of “order preserving” in the constraint (b) in its definition. An “upturned expander” is the same as an expander except that it is modified by “q<p” instead of “p<q” in the constraint (b) in its definition. In other words, an upturned compressor/decompressor/expander means a compressor/decompressor/expander with the input/output/output array in reverse ordering.

[0417] The corresponding constraints are respectively referred to as the “upturned-compressor constraint”, “upturned-decompressor constraint” and “upturned-expander constraint”.

EXAMPLE 7

[0418] As alluded to above, the switching cell is both a 2×2 compressor and decompressor, and the expander cell is a 2×2 expander. Furthermore, being a nonblocking switch, the switching cell is automatically an upturned compressor and an upturned decompressor, while the expander cell is an upturned expander.

EXAMPLE 8

[0419] A 4×4 switch qualifies as a compressor if and only if it accommodates at least the sixteen combinations of concurrent point-to-point connections depicted by element 3500 of FIGS. 35A-P. In contrast, a 4×4 switch qualifies as an upturned compressor if and only if it accommodates at least the sixteen combinations of concurrent point-to-point connections depicted by element 3500 as in FIGS. 36A-P.

[0420] 4. UC nonblocking switch and CU nonblocking switch

[0421] The conventional mathematical notation for the set of integersmodulo N is Z_(N). This is a set of N elements arranged in the circularorder and hence is regarded as a “discretized circle of length N”. Afunction ƒ defined over the set {0, 1, . . . N−1} induces a functionover Z_(N) by:

ƒ(x mod N)=ƒ(x)

[0422] This bends the domain {0, 1, . . . N−1} of the function ƒ into adiscretized circle.

[0423] Definition D4: “circular unimodal” function. A permutation over the set {0, 1, . . . , N−1} is said to be “circular unimodal” if its induced function from the discretized circle Z_(N) to {0, 1, . . . , N−1} possesses only one local maximum and one local minimum.

[0424] In other words, a function ƒ defined over the set {0, 1, . . . , N−1} is circular unimodal if the sequence ƒ(0), ƒ(1), . . . , ƒ(N−1), when bent into a circle, has only one local maximum and one local minimum. Equivalently, the same sequence, after an appropriate rotation, is the concatenation of a monotonically increasing sub-sequence with a monotonically decreasing sub-sequence.
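As an illustration of Definition D4, the following minimal Python sketch counts local maxima and minima of a sequence bent into a circle. The function name is hypothetical, and treating sequences of length two or less as trivially circular unimodal is an assumption made here, not something stated in the definition.

```python
def is_circular_unimodal(values):
    """Return True if the sequence, bent into a circle, has exactly one
    local maximum and exactly one local minimum (Definition D4).
    Values are assumed to be distinct, as for a permutation."""
    n = len(values)
    if n <= 2:
        # Assumption: too short to have more than one peak on the circle.
        return True
    maxima = 0
    minima = 0
    for i in range(n):
        prev = values[i - 1]          # index -1 wraps to the last element
        cur = values[i]
        nxt = values[(i + 1) % n]     # wrap around the circle
        if cur > prev and cur > nxt:
            maxima += 1
        elif cur < prev and cur < nxt:
            minima += 1
    return maxima == 1 and minima == 1


print(is_circular_unimodal([1, 3, 2, 0]))  # one peak (3), one valley (0)
print(is_circular_unimodal([0, 2, 1, 3]))  # two peaks (2 and 3)
```

The first call reports a circular unimodal sequence and the second does not, because 2 and 3 are both local maxima on the circle.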

[0425] Definition D5: “unimodal-circular nonblocking” switch and “circular-unimodal nonblocking” switch. An N×N switch is said to be “unimodal-circular nonblocking” or “UC nonblocking” if it can accommodate every complete matching between all input addresses and all output addresses, subject to the following constraint: under the matching, the linear input address is a circular unimodal function of the linear output address. This constraint is referred to as the “UC-nonblocking constraint”.

[0426] An N×N switch is said to be “circular-unimodal nonblocking” or “CU nonblocking” if it can accommodate every complete matching between all input addresses and all output addresses, subject to the following constraint: under the matching, the linear output address is a circular unimodal function of the linear input address. This constraint is referred to as the “CU-nonblocking constraint”.

[0427] A complete matching between all input addresses and all output addresses means a combination of N concurrent point-to-point connections. The first letter in either “UC nonblocking” or “CU nonblocking” refers to the input side, and the second letter to the output side. Thus, “UC” stands for bending the output address range into a discretized circle, on which the correspondence with input addresses defines a unimodal function. Symmetrically, “CU” stands for bending the input address range into a discretized circle, on which the correspondence with output addresses defines a unimodal function.

EXAMPLE 9

[0428] Every nonblocking switch is automatically UC nonblocking and CU nonblocking. The switching cell is a 2×2 example.

EXAMPLE 10

[0429] A 4×4 switch qualifies as a UC nonblocking switch if and only if it accommodates at least the sixteen combinations of concurrent point-to-point connections depicted by element 3600 of FIGS. 37A-P.

EXAMPLE 11

[0430] FIG. 38A shows an exemplifying I/O matching (3810) from 10 input ports to 10 output ports which is compliant with the UC-nonblocking constraint and thus can be accommodated by a 10×10 UC nonblocking switch. Bending the output address range into a discretized circle 3811 of length 10 and going along the circle from 0 to 9, the corresponding input addresses are 4, 1, 0, 2, 3, 5, 6, 8, 9, 7. As indicated by the curve 3812, this sequence defines a unimodal function over Z₁₀ with the only local maximum “9” and the only local minimum “0”. Thus the sequence defines a circular unimodal function. Equivalently, the same sequence can be rotated into 0, 2, 3, 5, 6, 8, 9, 7, 4, 1 and becomes the concatenation of the monotonically increasing sub-sequence “0, 2, 3, 5, 6, 8, 9” and the monotonically decreasing sub-sequence “7, 4, 1”. Note that, in the partition into monotonically increasing and decreasing sub-sequences, the maximum and minimum can go to either side. For example, the partition can also be “2, 3, 5, 6, 8, 9” and “7, 4, 1, 0”. Similarly, FIG. 38B shows an exemplifying I/O matching (3820) from 10 input ports to 10 output ports which is compliant with the CU-nonblocking constraint and thus can be accommodated by a 10×10 CU nonblocking switch.
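The rotation argument of this example can be reproduced with a short Python sketch. It uses the “rotate, then split” characterization rather than counting extrema. The function names are hypothetical, and the matching is encoded, as an assumption of this sketch, by listing the input address seen at output addresses 0 through 9.

```python
def rotate_and_split(seq):
    """Rotate seq so its minimum comes first, then cut right after its
    maximum. Returns (rising_part, falling_part)."""
    seq = list(seq)
    start = seq.index(min(seq))
    rotated = seq[start:] + seq[:start]
    peak = rotated.index(max(rotated))
    return rotated[:peak + 1], rotated[peak + 1:]


def is_circular_unimodal(seq):
    """True if the rotated sequence rises strictly, then falls strictly."""
    rising, falling = rotate_and_split(seq)
    goes_up = all(a < b for a, b in zip(rising, rising[1:]))
    goes_down = all(a > b for a, b in zip(falling, falling[1:]))
    return goes_up and goes_down


def satisfies_uc(input_of_output):
    """UC constraint: input address is circular unimodal in the output address."""
    return is_circular_unimodal(input_of_output)


def satisfies_cu(input_of_output):
    """CU constraint: output address is circular unimodal in the input address."""
    output_of_input = [0] * len(input_of_output)
    for out_addr, in_addr in enumerate(input_of_output):
        output_of_input[in_addr] = out_addr
    return is_circular_unimodal(output_of_input)


# Input addresses read along output addresses 0..9, as quoted in the text.
matching_3810 = [4, 1, 0, 2, 3, 5, 6, 8, 9, 7]
print(rotate_and_split(matching_3810))
print(satisfies_uc(matching_3810), satisfies_cu(matching_3810))
```

For this matching the split is “0, 2, 3, 5, 6, 8, 9” followed by “7, 4, 1”, in agreement with the text. The same matching does not satisfy the CU-nonblocking constraint, which shows that the two constraints are distinct.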

[0431] 5. Circular expander

[0432] Definition D6: “circular expander”. Label both input ports and output ports of an N×N switch by 0, 1, . . . , N−1. The switch is called a “circular expander switch”, or simply “circular expander”, if it can accommodate every combination of concurrent connections, point-to-point or multicast, subject to the following constraint: if the input ports j and k are connected to the output ports p and q, respectively, then ||j−k||_(N)≦|p−q|, where ||j−k||_(N)=min {|j−k|, N−|j−k|} is the distance between j and k on the discrete circle Z_(N). This constraint is referred to as the “circular-expander constraint”.
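The circular-expander constraint can be checked pairwise. The sketch below is illustrative only: the function names are hypothetical, and a combination of connections is represented, by assumption, as a list of (input port, output port) pairs, with a multicast connection appearing as several pairs that share one input port.

```python
from itertools import combinations


def circular_distance(j, k, n):
    """Distance ||j - k||_N between j and k on the discrete circle Z_N."""
    d = abs(j - k) % n
    return min(d, n - d)


def satisfies_circular_expander(connections, n):
    """Check ||j - k||_N <= |p - q| for every two connections j->p and k->q."""
    return all(
        circular_distance(j, k, n) <= abs(p - q)
        for (j, p), (k, q) in combinations(connections, 2)
    )


# Input 0 multicasts to outputs 0 and 1; input 1 goes to output 2.
print(satisfies_circular_expander([(0, 0), (0, 1), (1, 2)], 4))
# Inputs 3 and 0 are adjacent on the circle, so adjacent outputs suffice.
print(satisfies_circular_expander([(3, 0), (0, 1)], 4))
# Inputs 0 and 2 are at circular distance 2 but the outputs are only 1 apart.
print(satisfies_circular_expander([(0, 0), (2, 1)], 4))
```

The first two combinations comply with the constraint and the third does not.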

EXAMPLE 12

[0433] The expander cell is a 2×2 circular expander.

[0434] A UC nonblocking (resp. CU nonblocking) switch is both a compressor (resp. decompressor) and upturned compressor (resp. upturned decompressor). A circular expander is an expander, upturned expander, CU nonblocking switch, decompressor, and upturned decompressor.

[0435] 6. Preservation of conditionally nonblocking properties by 2X or X2 interconnection

[0436] When every node in a 2X interconnection network is filled by a compressor, the network constructs a compressor. That is, 2X interconnection preserves the compressor property of a switch. Recursively, a large compressor can be built by the recursive application of 2X interconnection with each building block filled by a smaller compressor.

[0437] When every node in a 2X interconnection network is filled by an upturned compressor, the network constructs an upturned compressor. That is, 2X interconnection preserves the upturned compressor property of a switch. Recursively, a large upturned compressor can be built by the recursive application of 2X interconnection with each building block filled by a smaller upturned compressor.

[0438] When every node in a 2X interconnection network is filled by a UC nonblocking switch, the network constructs a UC nonblocking switch. That is, 2X interconnection preserves the UC nonblocking property of a switch. Recursively, a large UC nonblocking switch can be built by the recursive application of 2X interconnection with each building block filled by a smaller UC nonblocking switch.

[0439] When every node in an X2 interconnection network is filled by a decompressor, the network constructs a decompressor. That is, X2 interconnection preserves the decompressor property of a switch. Recursively, a large decompressor can be built by the recursive application of X2 interconnection with each building block filled by a smaller decompressor.

[0440] When every node in an X2 interconnection network is filled by an upturned decompressor, the network constructs an upturned decompressor. That is, X2 interconnection preserves the upturned decompressor property of a switch. Recursively, a large upturned decompressor can be built by the recursive application of X2 interconnection with each building block filled by a smaller upturned decompressor.

[0441] When every node in an X2 interconnection network is filled by a CU nonblocking switch, the network constructs a CU nonblocking switch. That is, X2 interconnection preserves the CU nonblocking property of a switch. Recursively, a large CU nonblocking switch can be built by the recursive application of X2 interconnection with each building block filled by a smaller CU nonblocking switch.

[0442] When every node in an X2 interconnection network is filled by an expander, the network constructs an expander. That is, X2 interconnection preserves the expander property of a switch. Recursively, a large expander can be built by the recursive application of X2 interconnection with each building block filled by a smaller expander.

[0443] When every node in an X2 interconnection network is filled by an upturned expander, the network constructs an upturned expander. That is, X2 interconnection preserves the upturned expander property of a switch. Recursively, a large upturned expander can be built by the recursive application of X2 interconnection with each building block filled by a smaller upturned expander.

[0444] When every node in an X2 interconnection network is filled by a circular expander, the network constructs a circular expander. That is, X2 interconnection preserves the circular expander property of a switch. Recursively, a large circular expander can be built by the recursive application of X2 interconnection with each building block filled by a smaller circular expander.

[0445] The relationship among switch attributes that are preserved under 2X or X2 interconnection is depicted by diagram 3900 of FIG. 38.

EXAMPLE 13

[0446] Consider a 15×15 compressor 4000 constructed from the 2X version of 2Stg(3,5), as shown in FIG. 39, by filling in the nodes with any compressors (4001, 4002, 4003, 4004, 4005, 4006, 4007, 4008) of appropriate sizes. Suppose seven concurrent connections are requested between the array of external input ports and array of external output ports (4009, 4010):

a: 0 → 13
b: 1 → 14
c: 2 → 0
d: 7 → 1
e: 8 → 2
f: 11 → 3
g: 12 → 4

[0447] The combination of these seven connections is clearly compliant with the compressor constraint and thus must be accommodated by the 15×15 compressor so constructed. To shed some light on why this is true, one can examine the requested connections imposed on each individual node locally by the global connections. For example, the global connection 0→13 imposes the connection 0→1 on the first input node and also the connection 0→4 on the second output node. Thus, for example, three connections are requested on the first input node: 0→1, 1→2, 2→0; one can easily verify that the combination of these three connections is compliant with the compressor constraint and thus can be accommodated by the compressor filling the first input node.
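The local connections quoted in this example can be derived mechanically. The following Python sketch does so under an addressing convention that is an assumption of this sketch, chosen because it reproduces the numbers given in the text: there are five 3×3 input nodes and three 5×5 output nodes, external input i sits on input node ⌊i/3⌋ at local port i mod 3, and under the 2X numbering external output p sits on output node p mod 3 at local port ⌊p/3⌋. The governing definition of 2X interconnection is the one given earlier in the specification; the function name is hypothetical.

```python
from collections import defaultdict

PORTS_PER_INPUT_NODE = 3  # also the number of output nodes (assumption)


def local_connections(global_connections):
    """Split each global connection src->dst into the local connection it
    imposes on one input node and on one output node."""
    on_input_node = defaultdict(list)
    on_output_node = defaultdict(list)
    for src, dst in global_connections:
        in_node, in_port = divmod(src, PORTS_PER_INPUT_NODE)
        out_port, out_node = divmod(dst, PORTS_PER_INPUT_NODE)
        # Output k of an input node leads to output node k; input a of an
        # output node comes from input node a.
        on_input_node[in_node].append((in_port, out_node))
        on_output_node[out_node].append((in_node, out_port))
    return dict(on_input_node), dict(on_output_node)


requests = [(0, 13), (1, 14), (2, 0), (7, 1), (8, 2), (11, 3), (12, 4)]
input_side, output_side = local_connections(requests)
print(input_side[0])   # local connections on the first input node
print(output_side[1])  # local connections on the second output node
```

Under these assumptions the first input node receives the local requests 0→1, 1→2 and 2→0, and the second output node receives 0→4 among its requests, as stated in the text.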

[0448] In conclusion, 2X interconnection preserves the compressor, upturned compressor, and UC nonblocking properties of a switch, while X2 interconnection preserves the decompressor, upturned decompressor, CU nonblocking, expander, upturned expander, and circular expander properties of a switch. The same preservation holds when 2X or X2 interconnection is recursively invoked. In particular, recursive 2X and X2 constructions from cells lead to indefinitely large conditionally nonblocking switches of the aforementioned nine types.

EXAMPLE 14

[0449] A special case in preserving the conditionally nonblocking properties is when all the nodes in the network are 2×2 and filled with switching cells. A switching cell is a nonblocking switch (which is also a UC nonblocking switch, CU nonblocking switch, compressor, upturned compressor, decompressor, and upturned decompressor). From switching cells, a recursive 2X (resp. X2) construction realizes a UC nonblocking switch (resp. CU nonblocking switch), which is also a compressor and upturned compressor (resp. a decompressor and upturned decompressor).

EXAMPLE 15

[0450] Another case is when all the nodes in the network are 2×2 and filled with expander cells. An expander cell is a 2×2 “nonblocking switch in the multicast sense”, i.e., it accommodates every combination of connections without any constraint. It is in particular a circular expander. From expander cells, a recursive X2 construction realizes a circular expander, which is also an expander, upturned expander, CU nonblocking switch, decompressor, and upturned decompressor.

[0451] 7. Construction of conditionally nonblocking switches

[0452] As alluded to above, the recursive 2X interconnection network of cells preserves the compressor, upturned compressor and UC nonblocking properties of a switch. Recall from section C5 that every recursive 2X interconnection network of cells is a banyan-type network with monotonically decreasing trace and guide. In general, any banyan-type network with both of its trace and guide being monotonically decreasing will preserve the same properties. In fact, the following statements are equivalent for a banyan-type network:

[0453] Both the trace and the guide are monotonically decreasing.

[0454] The network constructs a UC nonblocking switch out of switching cells.

[0455] The network constructs a compressor out of switching cells.

[0456] The network constructs an upturned compressor out of switching cells.

[0457] Analogously, the recursive X2 interconnection network of cells preserves the decompressor, upturned decompressor, CU nonblocking, expander, upturned expander, and circular expander properties of a switch, and every recursive X2 interconnection network of cells is a banyan-type network with monotonically increasing trace and guide. In general, any banyan-type network with both of its trace and guide being monotonically increasing will preserve the same properties. In fact, the following statements are equivalent for a banyan-type network:

[0458] Both the trace and the guide are monotonically increasing.

[0459] The network constructs a CU nonblocking switch out of switching cells.

[0460] The network constructs a decompressor out of switching cells.

[0461] The network constructs an upturned decompressor out of switching cells.

[0462] The network constructs a circular expander out of expander cells.

[0463] The network constructs an expander out of expander cells.

[0464] The network constructs an upturned expander out of expander cells.

[0465] In conclusion, each of the aforementioned nine conditionally nonblocking properties of a switch is preserved by two families of networks:

[0466] either recursive 2X or recursive X2 constructions with arbitrary sizes of building block, and

[0467] banyan-type networks either with both trace and guide being monotonically decreasing or with both trace and guide being monotonically increasing.

[0468] The relationship between the two families is summarized by diagrams 4100 and 4110, respectively, in FIG. 41.

[0469] 8. Realization of conditionally nonblocking switches by an arbitrary banyan-type network with appropriate I/O exchanges

[0470] In section C9 it is stated that when a 2^(n)×2^(n) banyan-type network with the trace induced by a permutation τ and the guide by a permutation γ is prepended by an additional input exchange X_(λ) and appended by an additional output exchange X_(π), where λ=τ′⁻¹τ and π=γ⁻¹γ′, the trace becomes induced by the permutation τ′ and the guide by the permutation γ′. In view of the constructions in section D7, this method of altering the trace and guide is of particular interest when τ′=σ^((n))=γ′, that is, the new trace and guide are both monotonically decreasing sequences, or when τ′=id=γ′, that is, the new trace and guide are both monotonically increasing sequences.

[0471] Thus let the trace of an arbitrarily given banyan-type network [σ₀:σ₁: . . . :σ_(n−1):σ_(n)] be the sequence τ(1), τ(2), . . . , τ(n) and the guide be γ(1), γ(2), . . . , γ(n). Then, the banyan-type network [λσ₀:σ₁: . . . :σ_(n−1):σ_(n)π] has monotonically decreasing trace and guide, where λ=σ^((n))τ and π=γ⁻¹σ^((n)). The difference between the two networks is the prepending of the additional input exchange X_(λ) and the appending of the additional output exchange X_(π). Similarly, the banyan-type network [λσ₀:σ₁: . . . :σ_(n−1):σ_(n)π] has monotonically increasing trace and guide, where λ=τ and π=γ⁻¹.

[0472] Different banyan-type networks may be functionally equivalent and can substitute for each other in applications. Among all banyan-type networks, those with the minimum layout complexity according to the “2-layer Manhattan model with reserved layers” turn out to be “divide-and-conquer networks”, as disclosed by S.-Y. R. Li, “Optimal multi-stage interconnection by divide-and-conquer networks,” Proceedings of the IASTED International Conference on Parallel and Distributed Computing and Networks, Brisbane, Australia, published by ACTA Press, Anaheim, Calif., pp. 318-323, 1998.

[0473] On the other hand, well-known banyan-type networks, such as the baseline network and the banyan network, all have anti-optimal layout complexities in some sense. Moreover, divide-and-conquer networks are noted for their utmost structural modularity.

[0474] When a 2^(n)×2^(n) divide-and-conquer network is appended with the swap exchange, the trace and guide are both monotonically decreasing. In fact, this network attains the minimum layout complexity among all 2^(n)×2^(n) banyan-type networks with monotonically decreasing trace and guide.

[0475] Similarly, when a 2^(n)×2^(n) divide-and-conquer network is prepended with the swap exchange, the trace and guide are both monotonically increasing. In fact, this network attains the minimum layout complexity among all 2^(n)×2^(n) banyan-type networks with monotonically increasing trace and guide.

EXAMPLE 16

[0476] FIG. 42 depicts a recursive 2X interconnection network of cells, which is the 16×16 reverse banyan network (4201) appended with the inverse shuffle exchange (4202). With monotonically decreasing trace and guide, this network realizes a compressor when every cell in it is filled with a switching cell. The same applies to the 16×16 divide-and-conquer network (4301) appended with the swap exchange (4302), which appears in FIG. 43. Both networks are functionally identical, but the latter enjoys superior layout complexity and structural modularity.

[0477] E. EQUIVALENCE AMONG BIT-PERMUTING NETWORKS UNDER INTRA-STAGE CELL REARRANGEMENT

[0478] Consider that every interconnection line inside a multi-stage network is an elastic string with one end affixed to an output of a node at one stage and the other end to an input of a node at the next stage. Let the ordering among nodes (e.g., cells) at a certain stage in the network be scrambled, but keep the elastic strings attached to the said output/input of nodes. An example is shown in FIG. 44A wherein stage 2 (44011) is to be scrambled; the results of scrambling are shown in FIG. 44B. For example, a node designated as node A in FIG. 44A, appearing as the node second from the top in stage 44011, is moved to the node appearing as the third from the top in FIG. 44B. Thus the exchanges immediately before and after the scrambled stage are altered. In fact, the exchange (44012) immediately before the scrambled stage gets multiplied by an “exchange of rearrangement” (44021) from the right-hand side and, meanwhile, the exchange (44013) immediately after the scrambled stage gets multiplied by the inverse (44022) of the “exchange of rearrangement” from the left-hand side. More details pertaining to FIGS. 44A and 44B will be covered in a later example.

[0479] Since the internal connectivity of the network is not altered by the scrambling, the networks before and after the scrambling are regarded as “equivalent”. This section describes the conditions for such equivalence among bit-permuting networks and also presents the mechanism for the conversion between equivalent networks.

[0480] 1. Cell rearrangement

[0481] Over a 2^(n)×2^(n) bit-permuting network, it is of particular interest when the scrambling of cell ordering within a stage results in another bit-permuting network. This would be the case when the aforementioned “exchange of rearrangement” is a permutation induced exchange, say, X_(κ). However, not every exchange induced by a permutation on integers 1 to n can play the role of this “exchange of rearrangement”. The scrambling is among the 2^(n−1) cells at the stage but does not scramble the ordering between the two inputs (resp. between the two outputs) of each cell. If X_(κ)(a₁a₂ . . . a_(n−1)x)=b₁b₂ . . . b_(n−1)y for any bits x and y, it implies that the cell at the binary address a₁a₂ . . . a_(n−1) is relocated to the new address b₁b₂ . . . b_(n−1) and consequently X_(κ)(a₁a₂ . . . a_(n−1)0)=b₁b₂ . . . b_(n−1)0 and X_(κ)(a₁a₂ . . . a_(n−1)1)=b₁b₂ . . . b_(n−1)1. For the permutation κ to possess this property, the equivalent condition is that κ(n)=n, that is, κ is actually a permutation on just the integers 1 to n−1. This observation leads to the following formal definition.

[0482] Definition E1: “cell rearrangement”. If κ is a permutation on the integers from 1 to n but preserves n, then the induced 2^(n)×2^(n) exchange X_(κ) is called a 2^(n)×2^(n) “cell rearrangement”. The application of the cell rearrangement X_(κ) to a particular stage of a bit-permuting network means the multiplication of the exchange immediately before the stage by X_(κ) from the right-hand side together with the multiplication of the exchange immediately after the stage by X_(κ) ⁻¹ from the left-hand side.
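The role of the condition κ(n)=n can be illustrated with a small Python sketch. The convention used for the induced exchange, namely that X_(κ) moves the bit at position i to position κ(i), with positions numbered 1 to n from the most significant bit, is an assumption of this sketch; it is chosen because it matches the relocation from binary(b₁b₂b₃) to binary(b₂b₃b₁) stated for X₍₃ ₂ ₁₎ in the example that follows. The function names are hypothetical.

```python
def induced_exchange(kappa, n):
    """Return X_kappa as a list: entry a is the new address of line a.
    kappa is a dict on 1..n; positions not listed are left fixed."""
    def move(addr):
        bits = [(addr >> (n - i)) & 1 for i in range(1, n + 1)]  # bits[i-1] is b_i
        new_bits = [0] * n
        for i in range(1, n + 1):
            new_bits[kappa.get(i, i) - 1] = bits[i - 1]
        return int("".join(str(b) for b in new_bits), 2)
    return [move(a) for a in range(2 ** n)]


def keeps_cells_together(exchange):
    """True if lines 2c and 2c+1 of every cell land on one cell, in order."""
    return all(
        exchange[2 * c] % 2 == 0 and exchange[2 * c + 1] == exchange[2 * c] + 1
        for c in range(len(exchange) // 2)
    )


cycle_321 = {3: 2, 2: 1, 1: 3}   # the cycle (3 2 1); it fixes n = 4
cycle_34 = {3: 4, 4: 3}          # the cycle (3 4); it moves n = 4

print(induced_exchange(cycle_321, 4)[0b1000])            # b1 moves to position 3
print(keeps_cells_together(induced_exchange(cycle_321, 4)))
print(keeps_cells_together(induced_exchange(cycle_34, 4)))
```

Under the stated convention, the exchange induced by (3 2 1) relocates whole cells and leaves the two lines of each cell in order, whereas the exchange induced by (3 4) separates the two lines of a cell and so cannot serve as a cell rearrangement.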

[0483] Explicitly, the application of the cell rearrangement X_(κ) to stage j of the 2^(n)×2^(n) k-stage network [σ₀:σ₁:σ₂: . . . :σ_(k−1):σ_(k)]_(n) results in the network [σ₀:σ₁: . . . :σ_(j−1)κ:κ⁻¹σ_(j): . . . :σ_(k)]_(n). Let κ₁, κ₂, . . . , κ_(k) be permutations on integers from 1 to n that preserve n. Then the application of the 2^(n)×2^(n) cell rearrangement induced by each κ_(j) to stage j, respectively, of the 2^(n)×2^(n) k-stage network [σ₀:σ₁:σ₂: . . . :σ_(k−1):σ_(k)]_(n) results in the network [σ₀κ₁:κ₁⁻¹σ₁κ₂:κ₂⁻¹σ₂κ₃: . . . :κ_(k−1)⁻¹σ_(k−1)κ_(k):κ_(k)⁻¹σ_(k)]_(n).

[0484] A cell rearrangement on any stage of a bit-permuting network [σ₀:σ₁:σ₂: . . . :σ_(k−1):σ_(k)]_(n) preserves both the trace and guide of the network.

EXAMPLE 1

[0485] FIGS. 44A-C exemplify the application of the cell rearrangement X₍₃ ₂ ₁₎ on stage 2 (44011) of the 16×16 baseline network [id:(1 2 3 4):(2 3 4):(3 4):id] 44010 of FIG. 44A; network 44020 of FIG. 44B is the rearranged network before simplifying the pictorial display of the exchanges. The cell rearrangement relocates a stage-2 cell from the generic address binary(b₁b₂b₃) to the new address binary(b₂b₃b₁). In other words, the exchange X₍₁ ₂ ₃ ₄₎ (44012) of FIG. 44A immediately before stage 2 is multiplied by X₍₃ ₂ ₁₎ (44021) of FIG. 44B from the right-hand side to yield the resulting exchange X₍₃ ₄₎ (44031) of FIG. 44C, while the exchange X₍₂ ₃ ₄₎ (44013) of FIG. 44A immediately after stage 2 is multiplied by X₍₁ ₂ ₃₎ (44022) of FIG. 44B, i.e., the inverse of X₍₃ ₂ ₁₎, from the left-hand side to yield the resulting exchange X₍₄ ₂₎₍₃ ₁₎ (44032) of FIG. 44C. The cell rearrangement results in the network 44030 having a simplified graphical representation:

[id:(1 2 3 4)(3 2 1):(1 2 3)(2 3 4):(3 4):id]=[id:(4 3):(4 2)(3 1):(4 3):id]
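The two products in this example can be checked with a few lines of Python. The sketch assumes that a product of permutations is read left to right (the left factor is applied first), which is the reading under which (1 2 3 4)(3 2 1) equals (3 4); the helper names are hypothetical.

```python
def perm_from_cycles(*cycles, n=4):
    """Build a permutation on 1..n (as a dict) from disjoint cycles."""
    perm = {i: i for i in range(1, n + 1)}
    for cycle in cycles:
        for a, b in zip(cycle, cycle[1:] + cycle[:1]):
            perm[a] = b
    return perm


def compose(first, second):
    """Left-to-right product: apply `first`, then `second` (assumed convention)."""
    return {i: second[first[i]] for i in first}


def inverse(perm):
    return {image: source for source, image in perm.items()}


def apply_cell_rearrangement(network, stage, kappa):
    """Multiply the exchange before `stage` by kappa on the right and the
    exchange after `stage` by the inverse of kappa on the left."""
    result = list(network)
    result[stage - 1] = compose(network[stage - 1], kappa)
    result[stage] = compose(inverse(kappa), network[stage])
    return result


identity = perm_from_cycles()
baseline = [identity,
            perm_from_cycles((1, 2, 3, 4)),
            perm_from_cycles((2, 3, 4)),
            perm_from_cycles((3, 4)),
            identity]
rearranged = apply_cell_rearrangement(baseline, 2, perm_from_cycles((3, 2, 1)))
expected = [identity,
            perm_from_cycles((4, 3)),
            perm_from_cycles((4, 2), (3, 1)),
            perm_from_cycles((4, 3)),
            identity]
print(rearranged == expected)
```

Under the assumed convention the rearranged baseline network coincides with [id:(4 3):(4 2)(3 1):(4 3):id], and the input and output exchanges are untouched.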

[0486] 2. Equivalence among banyan-type networks under cell rearrangement

[0487] Every given 2^(n)×2^(n) banyan-type network can be cell-rearranged into any other except possibly for the mismatch of I/O exchanges, and there is only a unique way for such cell rearrangement. More explicitly, given the banyan-type networks Φ=[σ₀:σ₁:σ₂: . . . :σ_(n−1):σ_(n)] and Ψ=[π₀:π₁:π₂: . . . :π_(n−1):π_(n)], there exists a unique sequence κ₁, κ₂, . . . , κ_(n) of permutations on integers from 1 to n that preserve n such that the application of the cell rearrangement induced by each κ_(j) to stage j, respectively, of the network Φ results in a network Ψ′ in the form of [α:π₁:π₂: . . . :π_(n−1):β] for some permutations α and β. As noted above, cell rearrangement preserves trace and guide and hence the network Ψ′=[α:π₁:π₂: . . . :π_(n−1):β] shares the same trace and guide with the network Φ. From the definition of trace, the two networks Ψ and Ψ′ share a common trace if and only if α=π₀ and share a common guide if and only if β=π_(n). Thus, the two given networks Φ and Ψ share a common trace if and only if α=π₀, which is also a necessary and sufficient condition for cell-rearranging Φ into a network that is identical with Ψ except possibly for a different output exchange. Similarly, the two given networks share a common guide if and only if β=π_(n), which is also a necessary and sufficient condition for cell-rearranging Φ into a network that is identical with Ψ except possibly for a different input exchange.

[0488] Since cell rearrangement does not alter the internal connectivity of a multi-stage network, the networks before and after the rearrangement are regarded as “equivalent” to each other and are exchangeable in applications. Thus two 2^(n)×2^(n) banyan-type networks are “equivalent” if and only if they share the same trace and guide. However, this is only the strong sense of “equivalence”. There are some weaker senses of network “equivalence” through cell rearrangement. For certain applications, the input exchange and/or the output exchange is immaterial and hence two given networks are regarded as “equivalent” to each other when one of the given networks can be cell-rearranged into a form that matches all interstage exchanges in the other given network but without necessarily matching the input exchange and/or the output exchange. Thus, there are four senses of network “equivalence” through cell rearrangement depending on whether or not to require the matching of the input exchange and whether or not to require the matching of the output exchange.

[0489] Two banyan-type networks are said to be “equivalent” to each other in the weak sense when one of them can be cell-rearranged into a network that matches all interstage exchanges of the other. All 2^(n)×2^(n) banyan-type networks are equivalent under this weak sense. One intermediate sense of equivalence between two networks is when one of them can be cell-rearranged into a network that matches the input exchange, as well as all interstage exchanges, of the other. The necessary and sufficient condition for the equivalence in this sense is the sharing of a common trace. Another intermediate sense of equivalence between two networks is when one of them can be cell-rearranged into a network that matches the output exchange, as well as all interstage exchanges, of the other. The necessary and sufficient condition for the equivalence in this sense is the sharing of a common guide. These four senses of equivalence among banyan-type networks are arranged into a hierarchical diagram 4500 in FIG. 45.

[0490] The equivalence among banyan-type networks without I/O exchanges is worth extra mentioning. Let two banyan-type networks Φ=[id:σ₁:σ₂: . . . :σ_(n−1):id] and Ψ=[id:π₁:π₂: . . . :π_(n−1):id] be given. There is a unique way of cell-rearranging the network Φ into the form of [α:π₁:π₂: . . . :π_(n−1):β] for some permutations α and β. This unique way of cell rearrangement leaves the first stage intact if and only if α=id, which is equivalent to the sharing of a common trace between the two given networks. Similarly, the unique way of cell rearrangement leaves the final stage intact if and only if β=id, which is equivalent to the sharing of a common guide between the two given networks. The four senses of equivalence among banyan-type networks without I/O exchanges are arranged into a hierarchical diagram 4600 as shown in FIG. 46.

EXAMPLE 2

[0491] Suppose that a chip implements a decompressor with a recursive X2 construction together with the circuitry for preprocessing the input traffic to ensure the compliance with the decompressor constraint. This construction can be replaced by some other banyan-type networks, as long as the decompressor property is preserved. Since the connections to the circuitry for input preprocessing fix the external input order of the network, the new network needs to share the same trace as the original network. On the other hand, since the external output order can be altered outside the chip or relabeled in order to preserve the decompressor property, it is not necessary for the new network to share the same guide as the original network.

[0492] 3. Equivalence among bit-permuting networks under cell rearrangement

[0493] The four senses of equivalence among banyan-type networks extend to all bit-permuting networks and are summarized into a hierarchical diagram 4700 in FIG. 47.

[0494] Two bit-permuting networks are equivalent to each other in the strong sense when they can be cell-rearranged into each other. The necessary and sufficient condition is for the two networks to share the same trace and the same guide.

[0495] One intermediate sense of equivalence between two networks is when one of them can be cell-rearranged into a network that matches the input exchange, as well as all interstage exchanges, of the other. The necessary and sufficient condition for the equivalence in this sense is the sharing of a common trace. When two 2^(n)×2^(n) bit-permuting networks are equivalent in this sense, there exists a permutation on integers 1 to n that maps the guide of one network term-by-term to the guide of the other.

[0496] Another intermediate sense of equivalence between two networks is when one of them can be cell-rearranged into a network that matches the output exchange, as well as all interstage exchanges, of the other. The necessary and sufficient condition for the equivalence in this sense is the sharing of a common guide. When two 2^(n)×2^(n) bit-permuting networks are equivalent in this sense, there exists a permutation on integers 1 to n that maps the trace of one network term-by-term to the trace of the other.

[0497] Two bit-permuting networks are equivalent to each other in the weak sense when one of them can be cell-rearranged into a network that matches all interstage exchanges of the other. Two k-stage 2^(n)×2^(n) bit-permuting networks are equivalent in this sense if and only if there exists a permutation on integers 1 to n that maps the trace of one network term-by-term to the trace of the other. This condition is equivalent to the existence of a permutation that maps the guide of one network term-by-term to the guide of the other.

[0498] The four senses of equivalence among bit-permuting networks without I/O exchanges are summarized into a hierarchical diagram 4800 in FIG. 48.

[0499] Let the permutation σ on integers 1 to n map the trace of a 2^(n)×2^(n) bit-permuting network term-by-term to the trace of another. Prepending the first network with the extra input exchange induced by σ⁻¹ makes the two networks share a common trace. On the other hand, if π maps the guide of the first network term-by-term to the guide of the second, then appending the first network with the extra output exchange X_(π) makes the two networks share a common guide. If both the extra input exchange and the extra output exchange are applied, the two networks share a common trace and a common guide. Thus the extra input exchange and/or the extra output exchange turn the equivalence in the weak sense into the equivalence in a stronger sense.

[0500] Examples of this technique have appeared in subsection D8 in the conversion of an arbitrarily given banyan-type network into one with monotonically decreasing/increasing trace and guide in order to preserve various conditionally nonblocking properties of a switch.

[0501] F. GENERALIZED DIVIDE-AND-CONQUER NETWORKS

[0502] 1. Recursive 2-stage construction associated with a binary tree

[0503] Recall the definitions in Section B of “2-stage interconnection”, “recursive 2-stage construction”, “2-stage tensor product”, etc. The following conventions are adopted throughout this section unless otherwise specified:

[0504] The term “2-stage interconnection” includes plain 2-stage interconnection, 2X interconnection, and X2 interconnection. Consequently, the terms of a “2-stage tensor product” would include the case of a “2X tensor product”, etc.

[0505] All building blocks of all constructions are cells, i.e., 2×2 nodes, hence the term “recursive 2-stage construction from cells” is abbreviated as “recursive 2-stage construction” in this section when there is no ambiguity.

[0506] All exchanges in the multi-stage interconnection networks are bit-permuting.

[0507] Recall from section B that a binary tree logs a procedure for “recursive applications of 2-stage interconnection” or “recursive 2-stage construction” in short. The binary tree is then said to be “associated with” the recursive 2-stage interconnection network yielded by the logged procedure. Paving the way for the description of certain inventive subject matter, this section provides further details in the association between binary trees and recursive 2-stage interconnection networks. Some basic notions pertaining to a binary tree are listed below:

[0508] In a binary tree, “leaves” always outnumber “internal nodes” by exactly one. Thus there are exactly k−1 internal nodes on a k-leaf tree.

[0509] The “weight” of a node J is defined to be the number of leaves in the sub-tree rooted at J.

[0510] When J is a leaf, the sub-tree rooted at J is a single node and hence the weight of a leaf is one.

[0511] A binary tree is said to be “balanced” if for every internal node, the weights of its two sons differ from each other by at most one.

[0512] A binary tree is said to be “anti-balanced” if for every internal node, at least one of its two sons is a leaf. In particular, a “leftist tree” (resp. a “rightist tree”) means a binary tree where the right-son (resp. left-son) of every internal node is a leaf.

EXAMPLE 1

[0513] FIGS. 49A-E show all five 4-leaf binary trees. The weight of each internal node is labeled on the node. Among the five trees 4910, 4920, 4930, 4940 and 4950, the tree 4910 is the only balanced tree, the tree 4920 is the rightist tree and the tree 4950 is the leftist tree.
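The notions above can be checked mechanically. The following is a minimal sketch, not part of the described apparatus, in which a leaf is represented by `None` and an internal node by a pair `(left_son, right_son)`; the function names are hypothetical and chosen only for illustration. It enumerates all 4-leaf binary trees and classifies them according to the definitions of weight, balanced, anti-balanced, leftist and rightist.

```python
from functools import lru_cache

LEAF = None  # a leaf; an internal node is a pair (left_son, right_son)


@lru_cache(maxsize=None)
def all_trees(k):
    """Return every binary tree with exactly k leaves."""
    if k == 1:
        return (LEAF,)
    return tuple((left, right)
                 for i in range(1, k)
                 for left in all_trees(i)
                 for right in all_trees(k - i))


def weight(node):
    """Number of leaves in the sub-tree rooted at node."""
    return 1 if node is LEAF else weight(node[0]) + weight(node[1])


def internal_nodes(node):
    """Yield every internal node of the tree rooted at node."""
    if node is not LEAF:
        yield node
        yield from internal_nodes(node[0])
        yield from internal_nodes(node[1])


def is_balanced(tree):
    """Weights of the two sons differ by at most one at every internal node."""
    return all(abs(weight(l) - weight(r)) <= 1 for l, r in internal_nodes(tree))


def is_anti_balanced(tree):
    """At least one son of every internal node is a leaf."""
    return all(l is LEAF or r is LEAF for l, r in internal_nodes(tree))


def is_leftist(tree):
    """The right-son of every internal node is a leaf."""
    return all(r is LEAF for _, r in internal_nodes(tree))


def is_rightist(tree):
    """The left-son of every internal node is a leaf."""
    return all(l is LEAF for l, _ in internal_nodes(tree))


four_leaf = all_trees(4)
print(len(four_leaf), "trees,", sum(is_balanced(t) for t in four_leaf), "balanced")
```

Under this representation, every 4-leaf tree has three internal nodes, in agreement with the rule that leaves outnumber internal nodes by one.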

[0514] The association between binary trees and recursive 2-stage interconnection networks can be built from bottom up through the following recursion:

[0515] A single-node binary tree is associated with the single-cell network.

[0516] A multi-node binary tree is associated with the 2-stage tensor product of Φ and Ψ, where Φ and Ψ, respectively, are networks associated with sub-trees rooted at the left and right sons of the root node.

EXAMPLE 2

[0517] The recursive plain 2-stage interconnection network associated with the balanced tree 5010 of FIG. 50A is the 16×16 network [:(3 4):(1 3)(2 4):(3 4):] 5100 shown in FIG. 51, which will be called the 16×16 “divide-and-conquer network” in a definition in the sequel. The one associated with the rightist tree 5020 of FIG. 50B is the 16×16 baseline network [:(1 2 3 4):(2 3 4):(3 4):] 5200 shown in FIG. 52. Symmetrically, the one associated with the leftist tree 5050 of FIG. 50E is the 16×16 reverse baseline network [:(4 3):(4 3 2):(4 3 2 1):], which is the mirror image of the 16×16 baseline network 5200. If “2X interconnection” is used instead of “plain 2-stage interconnection”, the recursive 2-stage interconnection network associated with the balanced tree 5010 is the 16×16 network [:(3 4):(1 3 2 4):(3 4):(1 3 2 4)] 5300 shown in FIG. 53. Meanwhile, the one associated with the rightist tree 5020 is the 16×16 baseline network appended with the swap exchange [:(1 2 3 4):(2 3 4):(3 4):(1 4)(2 3)] 5400 shown in FIG. 54, and the one associated with the leftist tree 5050 is the 16×16 reverse banyan network appended with the inverse shuffle exchange [:(3 4):(2 4):(1 4):(1 2 3 4)] 5500 shown in FIG. 55.
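The plain 2-stage cases of this example can be reproduced by a short recursion over the tree. The sketch below is illustrative only and rests on two assumptions inferred from the permutations listed in this example rather than stated here: that SHUF^((n)) is induced by the cycle (n n−1 . . . 1), so that the coordinate interchange at a step with sons of weights p and q is its q-th power, and that the exchanges of a 2^(p)×2^(p) or 2^(q)×2^(q) sub-network act on the last p or q indices of the combined network. The names `exchanges`, `embed` and `from_cycles` are hypothetical.

```python
LEAF = None  # a leaf (one cell); an internal node is a pair (left_son, right_son)


def weight(node):
    """Number of leaves in the sub-tree rooted at node."""
    return 1 if node is LEAF else weight(node[0]) + weight(node[1])


def from_cycles(n, *cycles):
    """Permutation on 1..n, as a tuple of images, from disjoint cycles."""
    perm = list(range(1, n + 1))
    for cycle in cycles:
        for a, b in zip(cycle, cycle[1:] + cycle[:1]):
            perm[a - 1] = b
    return tuple(perm)


def embed(perm, shift, n):
    """Let perm act on the indices shift+1 .. shift+len(perm) of 1..n."""
    out = list(range(1, n + 1))
    for i, image in enumerate(perm, start=1):
        out[i + shift - 1] = image + shift
    return tuple(out)


def exchanges(tree):
    """Interstage permutations of the plain 2-stage network logged by tree."""
    if tree is LEAF:
        return []  # a single cell has no interstage exchange
    left, right = tree
    p, q = weight(left), weight(right)
    n = p + q
    # Assumed coordinate interchange: q-th power of the cycle (n n-1 ... 1).
    middle = tuple((i - 1 - q) % n + 1 for i in range(1, n + 1))
    return ([embed(s, q, n) for s in exchanges(left)]
            + [middle]
            + [embed(s, p, n) for s in exchanges(right)])


balanced = ((LEAF, LEAF), (LEAF, LEAF))
rightist = (LEAF, (LEAF, (LEAF, LEAF)))
leftist = (((LEAF, LEAF), LEAF), LEAF)

for name, tree in [("balanced", balanced), ("rightist", rightist), ("leftist", leftist)]:
    print(name, exchanges(tree))
```

Each permutation is printed as a tuple of images, so `(1, 2, 4, 3)` stands for the transposition (3 4). Under the stated assumptions the three outputs agree with the divide-and-conquer, baseline and reverse baseline networks given above.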

[0518] As a convention stated at the beginning of this section, building blocks of a recursive 2-stage interconnection network are cells. Each leaf of the binary tree corresponds to a building block in the recursive 2-stage interconnection network associated with the tree, while a generic internal node J corresponds to the step of 2-stage interconnection in the same recursive 2-stage construction, where each input node at that step is a network associated with the sub-tree rooted at the left son of J and each output node at that step is a network associated with the sub-tree rooted at the right son of J.

EXAMPLE 3

[0519] A node of a binary tree corresponds to a building block or a step of 2-stage interconnection in the recursive construction of the network associated with the tree. The dimensions of a building block are 2×2, and the dimensions of the network resulting from each step of 2-stage interconnection are 2^(k)×2^(k) for some k. In this way every node of a binary tree corresponds to the dimensions 2^(k)×2^(k) for some k. For the five 4-leaf binary trees 4910, 4920, 4930, 4940 and 4950 in FIGS. 49A-49E, the corresponding dimensions of each node are indicated in FIGS. 50A-50E, where the five trees 5010, 5020, 5030, 5040, and 5050 are identical with those in FIGS. 49A-49E.

[0520] The association between binary trees and recursive 2-stage interconnection networks can be summarized in general as follows: The recursive plain 2-stage interconnection network associated with an n-leaf binary tree is a 2^(n)×2^(n) banyan-type network without I/O exchange, that is, a network in the form [id:σ₁: . . . :σ_(n−1):id]_(n) or simply [:σ₁: . . . :σ_(n−1):]_(n).

[0521] In particular, the recursive plain 2-stage interconnection network associated with the n-leaf rightist (resp. leftist) tree is the 2^(n)×2^(n) baseline network (resp. reverse baseline network).

[0522] The recursive 2X interconnection network associated with an n-leaf binary tree is a 2^(n)×2^(n) banyan-type network with an output exchange and without an input exchange, that is, a network in the form [id:σ₁: . . . :σ_(n−1):σ_(n)]_(n) or simply [:σ₁: . . . :σ_(n−1):σ_(n)]_(n). In particular, the recursive 2X interconnection network associated with the n-leaf leftist tree is the 2^(n)×2^(n) reverse banyan network appended with the 2^(n)×2^(n) inverse shuffle exchange.

[0523] The recursive 2X interconnection network associated with the n-leaf rightist tree is the 2^(n)×2^(n) baseline network appended with the 2^(n)×2^(n) swap exchange.

[0524] The recursive X2 interconnection network associated with an n-leaf binary tree is a 2^(n)×2^(n) banyan-type network with an input exchange and without an output exchange, that is, a network in the form [σ₀:σ₁: . . . :σ_(n−1):id]_(n) or simply [σ₀:σ₁: . . . :σ_(n−1):]_(n).

[0525] In particular, the recursive X2 interconnection network associated with the n-leaf leftist tree is the 2^(n)×2^(n) reverse baseline network prepended with the 2^(n)×2^(n) swap exchange.

[0526] The recursive X2 interconnection network associated with the n-leaf rightist tree is the 2^(n)×2^(n) banyan network prepended with the 2^(n)×2^(n) shuffle exchange.

[0527] 2. Divide-and-conquer network

[0528] Definition F1: “divide-and-conquer network”. A 2^(n)×2^(n) “divide-and-conquer network” is the recursive plain 2-stage interconnection network associated with an n-leaf balanced binary tree. In particular, the 2×2 divide-and-conquer network is just a single cell.

EXAMPLE 4

[0529] The only two 3-leaf trees are the leftist and the rightist trees. Both are balanced and also anti-balanced. Thus the 8×8 reverse baseline network is the divide-and-conquer network associated with the 3-leaf leftist tree 5610 in FIG. 56A. The mirror image, i.e., the 8×8 baseline network, is the divide-and-conquer network associated with the 3-leaf rightist tree.

EXAMPLE 5

[0530] Among the five 4-leaf trees shown in FIGS. 50A-50E, the only balanced tree is the tree 5010. The unique 16×16 divide-and-conquer network 5100, as shown in FIG. 51, is the recursive plain 2-stage interconnection network associated with the 4-leaf balanced tree 5010.

EXAMPLE 6

[0531] Associated with the 6-leaf balanced binary tree 5630 in FIG. 56C is the 64×64 divide-and-conquer network 5700 shown in FIG. 57. The middle exchange X₍₆ ₃₎₍₅ ₂₎₍₄ ₁₎ 5710 is the coordinate interchange in the 2-stage interconnection with parameters m=8 and n=8. This exchange divides the construction into two sides. On each side there are eight disjoint copies of the 8×8 reverse baseline network 5720, which is itself a divide-and-conquer network. The middle exchange X₍₆ ₃₎₍₅ ₂₎₍₄ ₁₎ in this 64×64 network is equivalent to the array of contact points between two perpendicular stacks of planes 5801/5802 depicted by FIG. 58. Each plane carries an 8×8 reverse baseline network 5720.

EXAMPLE 7

[0532] Associated with the 8-leaf balanced tree 5640 in FIG. 56D is the 256×256 divide-and-conquer network [:(8 7):(8 6)(7 5):(8 7):(8 4)(7 3)(6 2)(5 1):(8 7):(8 6)(7 5):(8 7):]. This network can be represented by two orthogonal stacks in the same fashion as FIG. 58 but with every plane carrying a 16×16 divide-and-conquer network 5100 instead of an 8×8 reverse baseline network. The network is divided by the middle exchange X₍₈ ₄₎₍₇ ₃₎₍₆ ₂₎₍₅ ₁₎ into two sides, each containing 16 disjoint copies of the 16×16 divide-and-conquer network. As mentioned in an earlier example, this 16×16 network, in turn, is divided by its middle exchange into two sides, each containing four disjoint copies of the 4×4 network. The 4×4 network, in turn, is divided by its exchange into two sides with two cells on each side. The structure of the above 256×256 example is most descriptive of the name “divide-and-conquer.”

EXAMPLE 8

[0533] According to the nature of a balanced tree, the weight differential between the two sons of every internal node is at most one. Thus, in the recursive 2-stage construction logged by a balanced tree, every step of 2-stage interconnection yields the tensor product between a certain 2^(p)×2^(p) network and a certain 2^(q)×2^(q) network where |p−q|≤1. In particular, at the step corresponding to the root of an n-leaf balanced tree, p=┌n/2┐ and q=└n/2┘, or p=└n/2┘ and q=┌n/2┐, where the notation ┌┐ stands for the arithmetic operation “ceiling” and └┘ for the arithmetic operation “floor”. A 2^(n)×2^(n) divide-and-conquer network can therefore be recursively constructed as the plain 2-stage tensor product 5900 in FIG. 59 between a 2^(┌n/2┐)×2^(┌n/2┐) divide-and-conquer network 5901 and a 2^(└n/2┘)×2^(└n/2┘) divide-and-conquer network 5902.
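The ceiling/floor recursion can be traced with a few lines of code. The following sketch is an illustration under the assumption that the left son always takes the ceiling half (the text allows either orientation); `balanced_tree` and `recursion_steps` are hypothetical names. It lists, for each distinct network size met in the recursion, the dimensions of the network and of its two factors.

```python
LEAF = None  # a leaf; an internal node is a pair (left_son, right_son)


def balanced_tree(n):
    """Build an n-leaf balanced tree whose left son carries ceil(n/2) leaves."""
    if n == 1:
        return LEAF
    p = (n + 1) // 2  # ceiling of n/2
    q = n // 2        # floor of n/2
    return (balanced_tree(p), balanced_tree(q))


def weight(node):
    """Number of leaves in the sub-tree rooted at node."""
    return 1 if node is LEAF else weight(node[0]) + weight(node[1])


def recursion_steps(n):
    """List (size, input-node size, output-node size) per distinct step."""
    steps, pending, seen = [], [n], set()
    while pending:
        k = pending.pop(0)
        if k == 1 or k in seen:
            continue  # a cell needs no step; a repeated size reuses a component
        seen.add(k)
        p, q = (k + 1) // 2, k // 2
        steps.append((2 ** k, 2 ** p, 2 ** q))
        pending += [p, q]
    return steps


for size, phi, psi in recursion_steps(8):
    print(f"{size}x{size} = tensor product of {phi}x{phi} and {psi}x{psi}")
```

For n=8 the trace descends from 256×256 through 16×16 and 4×4 to cells, matching the description in Example 7, and the small number of distinct sizes reflects the modular structure of the construction.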

[0534] A divide-and-conquer network achieves layout optimality under the 2-layer Manhattan model with reserved layers, which has been the most popular layout model for CMOS technologies. Every 2^(n)×2^(n) divide-and-conquer network achieves optimal layout complexity among the class of all 2^(n)×2^(n) banyan-type networks. In contrast, among all recursive 2-stage interconnection networks of cells, those associated with anti-balanced trees, including both baseline and reverse baseline networks, attain maximal layout complexity.

[0535] Besides layout optimality, another salient characteristic of divide-and-conquer networks is their modular structure. In the layered implementation as will be described in Section I, a generic component, such as an IC chip and/or a printed circuit board, implemented in correspondence with a step of 2-stage interconnection of the recursive construction can fill the roles of both the input node and the output node at the next step of 2-stage interconnection. This minimizes the number of different components required at each step of the recursive construction.

[0536] 3. Generalized divide-and-conquer network

[0537] As mentioned in Section E, banyan-type networks are often exchangeable in applications. Some of them have been constructed from intuition and have appeared in the literature. However, except for divide-and-conquer networks, they are all, in one sense or another, ranked among the least desirable choices based on the 2-layer Manhattan model. Therefore, in an application of any 2^(n)×2^(n) banyan-type network without I/O exchanges, a 2^(n)×2^(n) divide-and-conquer network can always be deployed instead to obtain layout optimality and structural modularity. However, some particular applications of banyan-type networks may impose ad hoc constraints that are incompatible with divide-and-conquer networks. It is therefore desirable to identify another class of networks with similar layout complexity and structural modularity. A wider choice enhances the chance of including one that meets the ad hoc requirements.

[0538] Recall from Section C that the interstage exchange in the plain 2-stage interconnection with parameters 2^(n−r) and 2^(r) has been called the coordinate interchange. It is a bit-permuting exchange; explicitly, it is the r^(th) power of SHUF^((n)). On the other hand, any other bit-permuting exchange can be used as long as it interconnects every input node with every output node, so that routability is guaranteed. Therefore, a generalized 2-stage interconnection network comprising 2^(r) input nodes, each of dimensions 2^(n−r)×2^(n−r), and 2^(n−r) output nodes, each of dimensions 2^(r)×2^(r), is called a bit-permuting 2-stage interconnection network with parameters 2^(n−r) and 2^(r) if and only if the interstage interconnection is in the pattern of a bit-permuting exchange induced by a permutation σ on the integers from 1 to n such that σ maps the numbers r+1, r+2, . . . , n into the set {1, 2, . . . , n−r}.
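The condition on σ is easy to test for a candidate permutation. The sketch below is illustrative only; it assumes, as an inference from the examples of subsection 1 rather than from this paragraph, that SHUF^((n)) is induced by the cycle (n n−1 . . . 1). A permutation is stored as a tuple whose i-th entry is the image of i, and the function names are hypothetical.

```python
def shuffle_power(n, r):
    """r-th power of the cycle (n n-1 ... 1), assumed to induce SHUF^(n)."""
    return tuple((i - 1 - r) % n + 1 for i in range(1, n + 1))


def is_valid_interstage(perm, r):
    """True if perm maps each of r+1, ..., n into the set {1, ..., n-r}."""
    n = len(perm)
    return all(perm[i - 1] <= n - r for i in range(r + 1, n + 1))


# The coordinate interchange satisfies the condition for every split.
print(all(is_valid_interstage(shuffle_power(n, r), r)
          for n in range(2, 9) for r in range(1, n)))

# The identity permutation does not: it leaves r+1, ..., n where they are.
print(is_valid_interstage((1, 2, 3, 4), 2))
```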

[0539] Definition F2: “bit-permuting 2-stage tensor product”. Let Φ be a 2^(n−r)×2^(n−r) (n−r)-stage network and Ψ a 2^(r)×2^(r) r-stage network. Fill the role of each input node in a bit-permuting 2-stage interconnection network with parameters 2^(n−r) and 2^(r) with a copy of Φ and each output node with a copy of Ψ. Ungroup nodes and lines inside every node so that they become elements directly belonging to the whole construction. The result is a 2^(n)×2^(n) n-stage network, which is called the “bit-permuting 2-stage tensor product of Φ and Ψ”.

[0540] Definition F3: “recursive bit-permuting 2-stage construction” and “recursive bit-permuting 2-stage interconnection network”. The recursive procedure in forming bit-permuting 2-stage tensor products to construct a large multi-stage network is referred to as the “recursive bit-permuting 2-stage construction”; the network so constructed from single-node networks is referred to as the “recursive bit-permuting 2-stage interconnection network”.

[0541] Every recursive bit-permuting 2-stage interconnection network is routable and in fact qualifies as a banyan-type network. Like the recursive 2-stage construction, every recursive bit-permuting 2-stage construction can be logged by a binary tree. The resulting recursive bit-permuting 2-stage interconnection network is then said to be “associated” with that binary tree. The recursive bit-permuting 2-stage interconnection network associated with every n-leaf binary tree is a 2^(n)×2^(n) banyan-type network without I/O exchanges.

[0542] Definition F4: “generalized divide-and-conquer network”. A generalized divide-and-conquer network is a recursive bit-permuting 2-stage interconnection network associated with a balanced binary tree.

[0543] Let an n-leaf balanced binary tree, n>1, be given. By interchanging the positions of the two sons of the root node if necessary, it may be assumed that the weight of the left-son of the root node is ┌n/2┐. A generalized 2^(n)×2^(n) divide-and-conquer network associated with this n-leaf balanced tree can be recursively constructed as a bit-permuting 2-stage tensor product between a generalized 2^(┌n/2┐)×2^(┌n/2┐) divide-and-conquer network and a generalized 2^(└n/2┘)×2^(└n/2┘) divide-and-conquer network.

[0544] Every 2^(n)×2^(n) generalized divide-and-conquer network achieves the same layout complexity and structural modularity as a conventional 2^(n)×2^(n) divide-and-conquer network. Therefore, every 2^(n)×2^(n) generalized divide-and-conquer network also achieves the optimal layout complexity among all 2^(n)×2^(n) banyan-type networks.

[0545] The exchanges in the form of the r^(th) power of SHUF^((n)), where 0<r<n, form a 2-parameter family of bit-permuting exchanges. In the conventional recursive 2-stage construction, the interstage exchange employed at every step of 2-stage interconnection belongs to this family. The following definition introduces another 2-parameter family of bit-permuting exchanges.

[0546] Definition F5: “SWAP^((n,r)) exchange”. Given integers n and r, 1≤r<n, let σ^((n,r)) denote the permutation (1 n)(2 n−1)(3 n−2) . . . (r n−r+1) and SWAP^((n,r)) denote the induced 2^(n)×2^(n) exchange. When r=1 or n−1, the permutation σ^((n,r)) is simply (1 n) and hence the exchange SWAP^((n,r)) reduces to the banyan exchange BANY^((n)). On the other hand, when r=└n/2┘ or ┌n/2┐, the permutation σ^((n,r)) coincides with σ^((n)) and hence the exchange SWAP^((n,r)) reduces to the swap exchange SWAP^((n)).
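The two reductions stated in Definition F5 follow from multiplying out the transpositions, since a transposition that occurs twice cancels. The sketch below, with the hypothetical name `swap_perm`, composes the r transpositions (i n−i+1) literally and stores the result as a tuple whose i-th entry is the image of i.

```python
def swap_perm(n, r):
    """Product of the transpositions (i, n-i+1) for i = 1, ..., r."""
    perm = list(range(1, n + 1))
    for i in range(1, r + 1):
        a, b = i, n - i + 1
        # Compose with the transposition (a b) by exchanging the images a and b.
        # The factors commute, so the order of composition does not matter.
        perm = [b if v == a else a if v == b else v for v in perm]
    return tuple(perm)


n = 5
print(swap_perm(n, 1), swap_perm(n, n - 1))            # both reduce to (1 n)
print(swap_perm(n, n // 2), swap_perm(n, (n + 1) // 2))  # both give the full reversal
```

The same function can be used to confirm that σ^((n,r)) maps r+1, . . . , n into {1, . . . , n−r} for every 1≤r<n, which is the condition required of a bit-permuting 2-stage interconnection.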

[0547] Definition F6: “2-swap interconnection network”. The “2-swap interconnection network” with parameters 2^(n−r) and 2^(r) is composed of 2^(r) input nodes, each of dimensions 2^(n−r)×2^(n−r), and 2^(n−r) output nodes, each of dimensions 2^(r)×2^(r), with the interstage interconnection in the pattern of the exchange SWAP^((n,r)).

[0548] Definition F7: “2-swap tensor product”. Let Φ be a 2^(n−r)×2^(n−r) (n−r)-stage network and Ψ a 2^(r)×2^(r) r-stage network. Fill the role of each input node in a 2-swap interconnection network with parameters 2^(n−r) and 2^(r) with a copy of Φ and each output node with a copy of Ψ. Ungroup nodes and lines inside every node so that they become elements directly belonging to the whole construction. The result is a 2^(n)×2^(n) n-stage network, which is called the “2-swap tensor product of Φ and Ψ”.

[0549] Definition F8: “recursive 2-swap construction” and “recursive 2-swap interconnection network”. In a recursive bit-permuting 2-stage construction, when the interstage exchange at each step of 2-stage interconnection with parameters 2^(k−r) and 2^(r) is SWAP^((k,r)), the construction is called a “recursive 2-swap construction”. The resulting network is called a “recursive 2-swap interconnection network”.

[0550] Let an n-leaf balanced binary tree, n>1, be given. By interchanging the positions of the two sons of the root node if necessary, it may be assumed that the weight of the left-son of the root node is ┌n/2┐. A 2^(n)×2^(n) divide-swap-conquer network associated with this n-leaf balanced tree can be recursively constructed as a 2-swap tensor product between a 2^(┌n/2┐)×2^(┌n/2┐) divide-swap-conquer network and a 2^(└n/2┘)×2^(└n/2┘) divide-swap-conquer network.

EXAMPLE 9

[0551] The 2^(n)×2^(n) banyan network (resp. reverse banyan network) is the recursive 2-swap interconnection network associated with the n-leaf rightist tree (resp. leftist tree).

[0552] Definition F9: “divide-swap-conquer network”. A divide-swap-conquer network is the recursive 2-swap interconnection network associated with a balanced binary tree. It is a special case of a generalized divide-and-conquer network.

EXAMPLE 10

[0553] The 16×16 divide-swap-conquer network [:(3 4):(1 4)(2 3):(3 4):] is the network 6000 as shown in FIG. 60.

EXAMPLE 11

[0554] The 64×64 divide-swap-conquer network associated with the 6-leaf balanced binary tree 5630 in FIG. 56C is [:(5 6):(4 6):(1 6)(2 5)(3 4):(5 6):(4 6):] and appears as the network 6100 in FIG. 61. The middle exchange X₍₁ ₆₎₍₂ ₅₎₍₃ ₄₎ (6110) divides the network into two sides, each containing eight disjoint copies of the 8×8 reverse banyan network (6120).

[0555] The family of recursive bit-permuting 2-stage constructions is quite broad because of the wide choices for the interstage exchange at each step of 2-stage interconnection. Divide-and-conquer, baseline, and reverse baseline networks belong to the subfamily of conventional recursive 2-stage constructions and are associated with balanced, rightist, and leftist trees, respectively. Their counterparts in the parallel subfamily of recursive 2-swap constructions are divide-swap-conquer, banyan, and reverse banyan networks, which are also associated with balanced, rightist, and leftist trees, respectively.
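Examples 10 and 11 can be regenerated by running the recursive 2-swap construction over the tree. The sketch below is illustrative and makes two assumptions that are not spelled out in the text: the exchanges of a sub-network act on the last p or q indices of the combined network, and the tree 5630 has a 3-leaf leftist tree as each son of its root (consistent with the 8×8 reverse banyan copies on both sides). All function names are hypothetical.

```python
LEAF = None  # a leaf (one cell); an internal node is a pair (left_son, right_son)


def weight(node):
    """Number of leaves in the sub-tree rooted at node."""
    return 1 if node is LEAF else weight(node[0]) + weight(node[1])


def from_cycles(n, *cycles):
    """Permutation on 1..n, as a tuple of images, from disjoint cycles."""
    perm = list(range(1, n + 1))
    for cycle in cycles:
        for a, b in zip(cycle, cycle[1:] + cycle[:1]):
            perm[a - 1] = b
    return tuple(perm)


def swap_perm(n, r):
    """Product of the transpositions (i, n-i+1) for i = 1, ..., r."""
    perm = list(range(1, n + 1))
    for i in range(1, r + 1):
        a, b = i, n - i + 1
        perm = [b if v == a else a if v == b else v for v in perm]
    return tuple(perm)


def embed(perm, shift, n):
    """Let perm act on the indices shift+1 .. shift+len(perm) of 1..n."""
    out = list(range(1, n + 1))
    for i, image in enumerate(perm, start=1):
        out[i + shift - 1] = image + shift
    return tuple(out)


def swap_exchanges(tree):
    """Interstage permutations of the recursive 2-swap network logged by tree."""
    if tree is LEAF:
        return []
    left, right = tree
    p, q = weight(left), weight(right)
    n = p + q
    middle = swap_perm(n, q)  # SWAP^(n,r) with r = q, the output-node exponent
    return ([embed(s, q, n) for s in swap_exchanges(left)]
            + [middle]
            + [embed(s, p, n) for s in swap_exchanges(right)])


balanced4 = ((LEAF, LEAF), (LEAF, LEAF))
leftist3 = ((LEAF, LEAF), LEAF)
tree6 = (leftist3, leftist3)  # assumed shape of the tree 5630

print(swap_exchanges(balanced4))
print(swap_exchanges(tree6))
```

Applied to the 4-leaf leftist tree, the same function yields [:(3 4):(2 4):(1 4):], the 16×16 reverse banyan network, in line with Example 9.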

[0556] G. SWITCHING CONTROL ASSOCIATED WITH A PARTIALLY ORDERED SET

[0557] Recall from Definition A3 that an m×n switch having an array of m input ports and an array of n output ports is defined by a set of at least two different connection states from the input array to the output array such that the set of connection states ensures the connectivity from every input to every output. This abstract notion of a switch refers to a switching fabric in unidirectional transmission, and the connection states in the definition map into those connection configurations realizable by the switching fabric. This notion does not specify the control of the selection, activation and transition of the connection configurations of the switching fabric. Such a control mechanism employed by a switch is referred to as the “switching control”. Therefore, the specification of the switching control complements the abstract notion of a switch.

[0558] Note that the switching control in general may cover the control of other parts of a switch besides the switching fabric, such as input traffic preprocessing, output multiplexing, admission control, and so forth, as well as other auxiliary functions in a switch. However, the switching control in this context, unless otherwise explicitly specified, refers to the control of a switch aimed at routing the incoming data units arriving at the input ports to their respective destined output ports by properly selecting, activating, setting, or changing the connection configurations of the switching fabric. Therefore, it is also called the “routing control” of the switch. The circuitry in a switch responsible for the switching control is called the “switching control circuitry”, or “routing control circuitry”, or even simply “control circuitry” when there is no ambiguity.

[0559] A data unit routed through a switch is loosely called a packet. An incoming data unit is sometimes interchangeably called an input signal or an input packet in the context.

[0560] 1. Centralized control vs. in-band control

[0561] The switching control can be in-band or out-of-band. A switch employing out-of-band control is illustrated by FIG. 62A. The control circuitry (6201) of this kind of switch is usually referred to as the central control unit, and is separated from the main switching fabric (6202). The connection configurations of the switching fabric, or equivalently, the connection states of the switch, are controlled by the control signals from this central control unit through the control input ports (6204), which are nondata input ports in addition to the array of data input ports (6205). When the switch is a switching network, that is, an interconnection network of switching elements, as exemplified in FIG. 62B, each switching element (e.g. 6211) of the switching network (6210) is controlled by a control signal from the central control unit (6212) through a control input port (6213). Recall that a combination of a connection state in each individual switching element determines a global connection state of the switch; thus by controlling each switching element, the overall switching control is achieved. Some popular switching architectures, such as the crossbar switch and the shared-buffer-memory switch, normally adopt out-of-band control. In response to a connection request, the central control unit (of a switch employing centralized control) needs to possess global knowledge of the status of the switch, including the addresses of the active I/O corresponding to the request, the existing connections established inside the switch, and the status of each of the switching elements, in order to make the appropriate route hunting/selection decision to accommodate the request. Therefore, centralized control usually requires high processing and memory speeds and inevitably imposes a bottleneck on the performance when the number of I/O is large. Hence centralized control is only suitable for a small number of I/O.

[0562] On the other hand, the control signal of a switch employing in-band control, called the “in-band control signal”, is carried along with each input packet. Typically, the in-band control signal is just one or a few bits prefixing the packet. FIG. 63A illustrates a switch (6300) of this type. Every input packet (6301) includes the in-band control signal (6302) followed by a payload (6303). The control signals from all input packets together determine the connection state of the switch. When an input port is idle, the input port will receive a signal of idle expression, e.g. a stream of bits “0”. Therefore, an input packet to a switch can be either a real data input signal or an idle expression.

[0563] Switching architectures of the multi-stage interconnection type are especially suitable for in-band control. For a switch realized from a multi-stage interconnection network of switching elements employing in-band control, as exemplified in FIG. 63B, the switching elements (6311, 6312, 6313, 6314) are interconnected in such a way that when each switching element (e.g. 6311) of the switching network (6310) determines its own connection state according to the control signals of the local input packets (6321, 6322) arriving at its local data input ports (6331), the global connection state of the switch is thereby determined and incoming signals can then be routed.

[0564] 2. Generic control of a switching cell

[0565] Recall from Section A that a switching cell is a 2×2 switch whose two connection states are “Bar” and “Cross”. As shown in FIG. 2A, the Bar state 201 refers to the connection state of concurrently connecting input-0 to output-0 and input-1 to output-1. FIG. 2B shows the Cross state 202, which is a connection state concurrently connecting input-0 to output-1 and input-1 to output-0. A switching cell in a switching network employing out-of-band control is depicted in FIG. 64A. The control signal to the switching cell (6401) is from the central control unit (6402) through the control input port (6403), and in the simplest case, a 1-bit signal is sufficient to control the two possible connection states. On the other hand, as shown in FIG. 64B, when the control is by in-band signaling, the two control signals (6411, 6412), each being one or a few bits prefixing the data packet (6413, 6414), arriving at the two data input ports (6415, 6416) of the switching cell (6417) together determine the Bar/Cross state of the cell. As alluded to above, distributed in-band control is preferred to centralized out-of-band control, especially in the switching control of a massive broadband switching network; therefore, the immediate focus here is only on in-band control.

[0566] All switching cells hereinafter refer to in-band-controlled switching cells unless otherwise explicitly specified.

[0567] For point-to-point switching (the case of multicast switching will be described in sub-section G6), normally there are three types of signals entering a switching cell: (1) data signals intended for output-0 of the cell, called “0-bound signals”, (2) data signals intended for output-1 of the cell, called “1-bound signals”, and (3) idle expressions, also to be called “idle signals”. When two input packets are destined for the same output port, output contention occurs, and there exist many ways in the existing art to resolve output contention. All possible combinations of the two signals arriving at the two inputs of a switching cell and the corresponding connection states are tabulated in Table 1.

TABLE 1
Connection state of the switching cell

Signal at      Signal at input-1
input-0        “idle”     “0-bound”                  “1-bound”
“idle”         Any        Cross                      Bar
“0-bound”      Bar        Contention for output-0    Bar
“1-bound”      Cross      Cross                      Contention for output-1

[0568] FIG. 65A presents the block diagram 6500 of a generic switching cell under in-band control. A bit pipeline from each of the two data inputs (6501, 6502) enters one of the two shift registers (6503, 6504). The control signals from the two shift registers together determine the state of the automaton (6510), which in turn determines the connection state of the switching cell. The connection state is implemented with two 2×1 multiplexers (6505, 6506), one at each of the two outputs (6507, 6508). A 2×1 multiplexer is a 2×1 two-state switch whose two connection states are ({0}, null) and (null, {0}), as respectively depicted in FIGS. 65B and 65C. The two input ports of a multiplexer receive the two bit pipelines, originating from input-0 and input-1, from both shift registers, but only one is routed to its single output, depending on the connection state. When the automaton enters the state “BAR” or “CROSS”, it signals both multiplexers 6505 and 6506 through the two control channels 6511 and 6512, respectively, to receive bits from the appropriate shift register. To implement the Bar connection state, the upper multiplexer 6505 is set to receive from the upper shift register and the lower multiplexer 6506 is set to receive from the lower shift register. On the other hand, the Cross connection state of the switching cell is achieved by setting each of the two multiplexers 6505 and 6506 to receive from the opposite shift register.

[0569] In-band-controlled switching cells are often deployed inside a multi-stage network, where signal synchronization is required not only between the two in-band control signals to each individual cell but also across the whole stage in the network. This ensures the synchronized arrival of two signals at every cell at the next stage regardless of the interstage exchange. The master clocking thus requires nondata input(s) to the cell. Through binary fan-outs, the master frame/bit clock signals (6511, 6512) are broadcast to all cells at the first stage and then propagated from one stage to another.
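The role of the two multiplexers can be modeled behaviorally in a few lines. This is a minimal software sketch of the data path only, not a description of the circuit of FIG. 65A: the shift registers, clocking and the automaton itself are omitted, and the names `mux` and `switching_cell` are hypothetical.

```python
def mux(select, upper, lower):
    """2x1 multiplexer: forward the output of the selected shift register."""
    return upper if select == "upper" else lower


def switching_cell(state, in0, in1):
    """Return (output-0, output-1) for a cell in the given connection state."""
    if state == "BAR":
        sel0, sel1 = "upper", "lower"   # each multiplexer reads its own side
    elif state == "CROSS":
        sel0, sel1 = "lower", "upper"   # each multiplexer reads the opposite side
    else:
        raise ValueError("state must be 'BAR' or 'CROSS'")
    return mux(sel0, in0, in1), mux(sel1, in0, in1)


print(switching_cell("BAR", "packet-A", "packet-B"))
print(switching_cell("CROSS", "packet-A", "packet-B"))
```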

[0570] 3. Sorting cell associated with a partially ordered set

[0571] Definition G1: “partial order”. A “partial order” on a set Ω of symbols means a nonempty subset ρ of {(a, b): a∈Ω, b∈Ω, and a≠b}, subject to the transitive law:

(a,b)∈ρ and (b,c)∈ρ ⇒ (a,c)∈ρ.

[0572] The set Ω is thus called a “partially ordered set” under ρ. Note that a partially ordered set must contain at least two elements. A more conventional notation for the statement (a, b)∈ρ is a<b, with the partial order ρ left implicit when there is no ambiguity. This reads as “a is smaller than b” or, equivalently, “b is greater than a.” The transitive law is then rewritten in the more familiar form:

a<b and b<c ⇒ a<c.

[0573] Simply speaking, a partial order on a set of symbols specifies the ordering relationship, or simply “order”, among the symbols, although the ordering does not necessarily exist between every pair of symbols. Note that no symbol can be smaller than itself by definition. Moreover, if x<y, then y<x cannot hold. In fact, if x<y and y<x, then the transitive law implies x<x, which is a contradiction. The partial order can be an artificial one. Even when the symbols are numbers, the partial order does not have to be consistent with the natural order.

[0574] One special case of a partial order is a linear order defined below.

[0575] Definition G2: “linear order”. A partial order on a set Ω of symbols qualifies as a “linear order” when it abides by the trinity law:

a≠b ⇒ a<b or b<a.

[0576] The set Ω in conjunction with the linear order is thus called an “ordered set”.

EXAMPLE 1

[0577] As mentioned above, the three types of signals entering a switching cell are 0-bound, 1-bound, or idle. Thus the set of signal values is {‘0-bound’, ‘idle’, ‘1-bound’}. An ideal switching cell for routing these three types of signals is one that always routes 0-bound signals to output-0 and 1-bound signals to output-1 whenever there is no output contention. To achieve this, one type of simple in-band control logic is for the switching cell to simply compare the two input values based on the following linear order defined on the set of the three symbols:

‘0-bound’<‘idle’<‘1-bound’,

[0578] and then route the signal of the smaller value to output-0 and the one of the larger value to output-1. In this way, since a 0-bound signal (resp. 1-bound signal) is the smallest (resp. largest) among the three types of signals, it will always be routed to output-0 (resp. output-1) unless another 0-bound signal (resp. 1-bound signal) competes with it, in which case output contention occurs. The resulting connection state is identical to the specification by Table 1.

EXAMPLE 2

[0579] A linear order defined on the set of symbols {00, 10, 11} does not necessarily have to be the natural order of 00<10<11. One legitimate linear order is 10<00<11. This awkward-looking order is of practical usefulness because, as will be explained in Example 4 in the sequel, the three values of a signal entering a switching cell are often encoded as:

‘0-bound’=10; ‘1-bound’=11; and ‘idle’=00

EXAMPLE 3

[0580] A partial order on the set of symbols {00, 01, 10, 11} is that

10<00<11 and 10<01<11,

[0581] which does not specify an ordering between 00 and 01. This exemplary order will be seen in the sequel for the routing control of an expander cell.

[0582] In broadband applications, it is important to implement in-band control over a switching cell with very simple hardware so as to avoid another source of bottleneck. Conceivably, one of the simplest types of in-band control logic is for the switching cell to simply compare the two input values based on a predetermined ordering among all possible values of an in-band control signal. Such a switching cell will be called a “sorting cell” in the next definition.

[0583] Definition G3: “sorting cell”. Consider an in-band-controlled switching cell where all possible values in an in-band control signal form a partially ordered set. This switching cell is called a “sorting cell associated with this partially ordered set” if it is under the switching control such that the input signal switched to output-0 is never greater than the one switched to output-1.

[0584] Definition G4: “0-1 sorting cell” and “routing cell”. The set {0, 1} under the natural order of 0<1 forms the “0-1 ordered set”, and the associated sorting cell is called the “0-1 sorting cell”. A “routing cell” is a sorting cell associated with the set {‘0-bound’, ‘idle’, ‘1-bound’} under the linear order ‘0-bound’<‘idle’<‘1-bound’.
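Definitions G1 and G2 translate directly into set tests, which can be applied to the orders of Examples 2 and 3. The sketch below is illustrative only; an order is stored as a set of pairs (a, b) meaning a<b, and since the examples list only chains such as 10<00<11, a helper first adds the pairs implied by the transitive law. All function names are hypothetical.

```python
from itertools import product


def transitive_closure(pairs):
    """Add every pair (a, c) implied by (a, b) and (b, c)."""
    closure = set(pairs)
    changed = True
    while changed:
        changed = False
        for (a, b), (c, d) in product(tuple(closure), repeat=2):
            if b == c and (a, d) not in closure:
                closure.add((a, d))
                changed = True
    return closure


def is_partial_order(rho):
    """Nonempty, no pair (a, a), and closed under the transitive law."""
    if not rho or any(a == b for a, b in rho):
        return False
    return all((a, d) in rho
               for (a, b), (c, d) in product(rho, repeat=2) if b == c)


def is_linear_order(rho, symbols):
    """A partial order in which every two distinct symbols are comparable."""
    return is_partial_order(rho) and all(
        (a, b) in rho or (b, a) in rho
        for a in symbols for b in symbols if a != b)


example2 = transitive_closure({("10", "00"), ("00", "11")})
example3 = transitive_closure({("10", "00"), ("00", "11"),
                               ("10", "01"), ("01", "11")})

print(is_linear_order(example2, {"00", "10", "11"}))
print(is_partial_order(example3), is_linear_order(example3, {"00", "01", "10", "11"}))
```

The order of Example 3 passes the partial-order test but fails the trinity law because 00 and 01 are not comparable.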

[0585] The correspondence between the input control signals and the connection states is summarized in Table 2 for a 0-1 sorting cell, and in Table 3 for a routing cell.

TABLE 2
Connection state               Input-1 control signal
                               0         1
Input-0 control signal   0     Any       Bar
                         1     Cross     Any

[0586]

TABLE 3
Connection state                     Input-1 control signal
                                     0-bound   idle      1-bound
Input-0 control signal   0-bound     Any       Bar       Bar
                         idle        Cross     Any       Bar
                         1-bound     Cross     Cross     Any
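The routing-by-sorting rule of Definitions G3 and G4 can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the names `RANK` and `routing_cell_state` are hypothetical, the 2-bit coding 10/00/11 is the one given in Example 2, and equal inputs (an idle pair or output contention) are reported as "any" without modelling the arbitration.

```python
# Hypothetical helper names; the ranks encode the linear order 10 < 00 < 11,
# i.e. '0-bound' < 'idle' < '1-bound'.
RANK = {"10": 0, "00": 1, "11": 2}

def routing_cell_state(in0: str, in1: str) -> str:
    """Return 'bar', 'cross' or 'any' for the 2-bit control signals
    presented at input-0 and input-1 of a routing cell."""
    r0, r1 = RANK[in0], RANK[in1]
    if r0 < r1:
        return "bar"    # smaller value is already on the output-0 side
    if r0 > r1:
        return "cross"  # smaller value must be switched to output-0
    return "any"        # equal values: idle pair or output contention

# Print the 3x3 table, rows = input-0, columns = input-1.
for a in ("10", "00", "11"):
    print(a, [routing_cell_state(a, b) for b in ("10", "00", "11")])
```

Iterating over the three codes in the order 0-bound, idle, 1-bound yields the same entries as Table 3.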

EXAMPLE 4

[0587] A signal entering a switching cell is either a real data signal or an idle expression. An idle expression is naturally a stream of ‘0’ bits. Thus every real data packet is prefixed by an activity bit ‘1’ in order to differentiate it from an idle expression. To perform the switching, it is also important to distinguish packets intended for output-0 from those intended for output-1. Thus the activity bit ‘1’ is followed by the address bit, which indicates the preference between the two outputs of the cell. The two bits together form the in-band control signal. Meanwhile, for an idle packet, the 2-bit in-band control signal is 00. Thus there are three possible values for an in-band control signal with the following coding:

‘0-bound’=10; ‘1-bound’=11; and ‘idle’=00

[0588] As mentioned in Example 1, an ideal switching control is then to route every 0-bound packet to output-0 and every 1-bound packet to output-1 whenever there is no output contention. This can be achieved when the switching cell is a routing cell. Its associated linear order of ‘0-bound’<‘idle’<‘1-bound’ gives a real data packet priority over an idle packet in choosing between the two outputs. Therefore, a routing cell can ideally implement the switching cell in the majority of cases.

[0589] 4. Control of a routing cell

[0590] Recall that a sorting cell is a switching cell with a special kind of in-band routing control: routing by sorting. Note that both the 0-1 sorting cell and the routing cell are sorting cells, each associated with a particular partially ordered set upon which the sorting is based. Different partially ordered sets lead to different implementations of the routing control.

[0591] A simple switching control for a routing cell can be described by a finite-state automata with the three states “INITIAL”, “BAR” and “CROSS”. The automata state “BAR” (resp. “CROSS”) corresponds to the Bar (resp. Cross) connection state of the switching cell. The automata state “INITIAL” is associated with an arbitrary connection state. Initially, the switching cell is in an arbitrary connection state, and the automata state is “INITIAL”. The prompt to the automata consists of the two leading bits (00=‘idle’, 10=‘0-bound’, 11=‘1-bound’) from each of the two synchronous data inputs. These inputs generate a total of nine different prompts.

[0592] When both input packets present 10 in the leading bits or both present 11, output contention occurs. It can be arbitrated in various ways, e.g., by misrouting or blocking one of the two packets. When both control signals are idle expressions 00, the automata state can be arbitrarily changed or remain INITIAL. For the remaining six prompts, the two control signals differ from each other and hence one of them is smaller than the other according to the linear order of 10<00<11. In reaction to the prompt the automata then enters a new state of “BAR” or “CROSS” and the connection state of the switching cell is latched accordingly. Subsequent bits then flow through the latched connection state of the cell.

[0593] An additional prompt to the automata is the frame clock from a nondata input, which resets the automata to the state “INITIAL”. Table 4 summarizes the automata action triggered by a prompt, but skips the detail in the arbitration of output contention.

TABLE 4

[0594] The optimal circuitry of switching control over a sorting cell is usually tailored to the underlying partial order in the particular application. This often necessitates an elaborate automata with many more detailed states than just three. The detailed state is represented by a number of registers, typically including one binary register for the connection state. Often the switching control is implemented in a way that absorbs one control bit at a time from each of the two inputs in order to simplify the logic for the computation of the connection state.

EXAMPLE 5

[0595] An exemplifying implementation of a routing cell by a 12-state automata is as follows. A state in the automata is represented by a pair (x, y). The x register is binary and represents the connection state: 0 for Bar and 1 for Cross. It directly controls the two output multiplexers in the block diagram of FIG. 65A. The y register assumes six possible values:

“INITIAL”, “0&0”, “0&1”, “1&0”, “1&1”, and “LATCHED”

[0596] The initial y value is “INITIAL”. Upon the arrival of an activity bit from each data input, it becomes 0&0, 0&1, 1&0, or 1&1, reflecting the obvious nomenclature of these states. Upon receiving the second bit from each input, the automata action includes the change of the y value to “LATCHED” and the delivery of the two activity bits to the two outputs through the latched connection state. Table 5 summarizes the state transition, where the arbitration of output contention always favors input 0. (Given this bias, the two y values 1&0 and 1&1 can be merged into one, unless the y value is needed in the regeneration of the activity bit.)

[0597] Once the y value becomes “LATCHED”, bit pipelines from the two inputs simply flow through the latched connection state. The effective prompt to the automata is then the frame clock signal to reset the y value to “INITIAL”. The only modification of a packet traversing this routing cell is the deletion of the second bit so that the third bit becomes the new second bit.

TABLE 5
Old State           Prompt                New State
y          x        Input 0    Input 1    y          x
Initial    Any      0          0          0&0        Any
Initial    Any      0          1          0&1        Any
Initial    Any      1          0          1&0        Any
Initial    Any      1          1          1&1        Any
0&0        Any      Any        Any        Latched    Any
0&1        Any      Any        0/1        Latched    1/0
1&0        Any      0/1        Any        Latched    0/1
1&1        Any      0/1        Any        Latched    0/1
Latched    0/1      Any        Any        Latched    0/1
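The state transitions of Example 5 can be traced with a small bit-serial simulation. The sketch below is an assumption-laden illustration, not the circuit of FIG. 65A: the function name `routing_cell` and the representation of packets as strings of '0'/'1' are hypothetical, output contention is arbitrated in favor of input 0 as stated for Table 5, and the losing packet is simply deflected to the other output.

```python
def routing_cell(pkt0: str, pkt1: str, x: int = 0):
    """Feed two synchronous packets (strings of '0'/'1') through the cell.

    Returns (x, out0, out1), where x is the latched connection state
    (0 = Bar, 1 = Cross) and out0/out1 are the streams leaving output-0
    and output-1 with the second (address) bit deleted.
    """
    y = "INITIAL"
    act0 = act1 = ""      # activity bits held until the state is latched
    body0 = body1 = ""    # bits that follow the consumed address bit
    for b0, b1 in zip(pkt0, pkt1):
        if y == "INITIAL":
            act0, act1 = b0, b1
            y = f"{b0}&{b1}"                 # one of 0&0, 0&1, 1&0, 1&1
        elif y != "LATCHED":
            if y == "0&1":                   # only input 1 carries data
                x = 1 if b1 == "0" else 0
            elif y in ("1&0", "1&1"):        # input 0 decides (bias to input 0)
                x = 0 if b0 == "0" else 1
            # y == "0&0": both idle, x is left as it was (arbitrary)
            y = "LATCHED"                    # address bits b0, b1 are consumed
        else:
            body0 += b0
            body1 += b1
    s0, s1 = act0 + body0, act1 + body1
    return (x, s0, s1) if x == 0 else (x, s1, s0)

# A 1-bound packet at input 0 against an idle expression at input 1.
print(routing_cell("110101", "000000"))  # (1, '00000', '10101')
```

Here the 1-bound packet latches the Cross state and leaves from output-1 with its address bit removed, while the idle stream appears at output-0.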

[0598] 5. Control of a 0-1 sorting cell

[0599] When control signals are k-bit, the sorting cell needs to absorb, say, k bits from each input before the connection state can be latched so that the two bit streams can flow through. However, some of the initial k bits in each stream may flow out before the latching of the connection state. The next example illustrates an ideal situation where the sorting cell buffers only one bit of each input stream at a time.

EXAMPLE 6

[0600] Consider a sorting cell with the following characteristics:

[0601] The in-band control signal has a fixed length of, say, k bits.

[0602] All the 2^(k) possible values are linearly ordered according to the lexicographic binary value.

[0603] The sorting cell routes two synchronized packets without altering their contents. Such a sorting cell can be implemented so that the two synchronized input bit streams pipeline through the cell with only a 1-bit delay: The sorting cell examines the two control signals bit by bit. The two bit streams are pipelined to the two outputs through an arbitrary connection state until the two signals start to differ, at which time the connection state is latched. All remaining bits then flow through the latched connection state. Note that although the sorting cell is associated with a linear order over the 2^(k) possible values (according to their lexicographic binary value), a simple sorting cell similar to the 0-1 sorting cell as defined in Definition G4 suffices for the purpose, since at each time only one bit from each input is compared.

EXAMPLE 7

[0604] The switching control of a 0-1 sorting cell may be implemented with a 4-state automata. Two binary registers x and y represent the automata state. The 0/1 value of x indicates the Bar/Cross connection state of the cell, respectively. It directly controls the two output multiplexers in the block diagram 6500 of FIG. 65A. The 0/1 value of y indicates the unlatched/latched status of the connection state, respectively. Initially, x is arbitrary and y=0. A control signal is pipelined bit by bit into the cell from each of the two data inputs. The state transition of the automata is summarized in Table 6.

TABLE 6
Old State      Prompt                New State
y     x        Input 0    Input 1    y     x
0     0        0          0          0     Any
0     0        0          1          1     0
0     0        1          0          1     1
0     0        1          1          0     Any
0     1        0          0          0     Any
0     1        0          1          1     0
0     1        1          0          1     1
0     1        1          1          0     Any
1     0/1      Any        Any        1     0/1

[0605] In a state with y=0, the prompt to the automata is a pair of bits, one from each data input. If the two bits match, the x register remains arbitrary and y remains 0. When the two bits differ, the connection state x of the cell is set accordingly and latched; that is, the state (x, y) becomes (0, 1) or (1, 1). Whether or not the two bits differ, they are sent to the two outputs through the prevailing connection state after the automata action. When the y register becomes 1, the effective prompt to the automata is the frame clock signal to reset y to 0. Meanwhile, bit streams from the two inputs continue to progress through the latched connection state.
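The bit-by-bit behavior of Examples 6 and 7 can be illustrated with the following Python sketch. It is a minimal model under assumptions: the function name `sort_cell_01` and the string representation of bit streams are hypothetical, and the frame clock reset is modelled simply by starting each call with y=0.

```python
def sort_cell_01(sig0: str, sig1: str, x: int = 0):
    """Pipeline two synchronized bit streams through a 0-1 sorting cell.

    x: connection state (0 = Bar, 1 = Cross), arbitrary on entry.
    y: latch status (0 = unlatched, 1 = latched).
    Returns (x, y, out0, out1).
    """
    y = 0
    out0, out1 = [], []
    for b0, b1 in zip(sig0, sig1):
        if y == 0 and b0 != b1:
            # First differing pair: Bar if input 0 carries the smaller bit.
            x = 0 if b0 == "0" else 1
            y = 1
        # Bits are forwarded through the prevailing state after the
        # automaton has acted; equal bits look the same on either path.
        if x == 0:
            out0.append(b0)
            out1.append(b1)
        else:
            out0.append(b1)
            out1.append(b0)
    return x, y, "".join(out0), "".join(out1)

print(sort_cell_01("1011", "1001"))  # (1, 1, '1001', '1011')
```

Because matching bits are indistinguishable at the outputs, latching at the first differing position is enough to deliver the lexicographically smaller stream to output-0 with a one-bit comparison per clock.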

[0606] 6. Bicast cell

[0607] Definition G5: “bicast-0 and bicast-1 connection states”. The 2×2 connection state that connects input-0 to both output-0 and output-1 is called the “bicast-0 connection state.” Similarly, the 2×2 connection state that connects input-1 to both output-0 and output-1 is called the “bicast-1 connection state.”

[0608] Recall that an “expander cell” is a 2×2 switch with the four connection states as shown in FIGS. 2C-F: bar (211), cross (212), bicast-0 (213), and bicast-1 (214). This terminology is independent of the switching control mechanism. Besides 0-bound, 1-bound, and idle packets, another type of signal that enters an expander cell is a data signal intended for multicasting to both output-0 and output-1 of the cell. These are called “bicast signals”. When one of the two input signals to an expander cell is a bicast signal, three cases arise. If the other signal is an idle signal, the bicast signal is of course routed to both outputs. If the other signal is a unicast signal, either 0-bound or 1-bound, it is fair to route the unicast signal to its intended output port and the bicast signal to the other output port. If the other signal is also a bicast signal, it is fairer to route each bicast signal to one of the two outputs than to route one bicast signal to both outputs and block the other, so in this case the connection state of the expander cell should be either bar or cross, but not bicast-0 or bicast-1. Under this natural assumption, all possible combinations of the two signals arriving at the two inputs of an expander cell and the corresponding connection states are tabulated in Table 7.

TABLE 7
Connection state of                  Signal at input-1
the expander cell                    “idle”      “0-bound”                 “1-bound”                 “bicast”
Signal at input-0     “idle”         Any         Cross                     Bar                       Bicast-1
                      “0-bound”      Bar         Contention for output-0   Bar                       Bar
                      “1-bound”      Cross       Cross                     Contention for output-1   Cross
                      “bicast”       Bicast-0    Cross                     Bar                       Bar/Cross

[0609] Definition G6: “bicast cell”. A “bicast cell” is an expander cell under the following in-band control. If one of the two inputs presents a bicast packet and the other presents an idle packet, the bicast packet is “bicasted”, which means:

[0610] (1) a copy of the bicast packet is sent to each of the two outputs through the bicast-0 or bicast-1 connection state;

[0611] (2) the copy received by output-0 assumes the status of a 0-bound packet instead of a bicast packet, i.e., the control signal of the copy received by output-0 is set to be ‘0-bound’; and

[0612] (3) the copy received by output-1 assumes the status of a 1-bound packet instead of a bicast packet, i.e., the control signal of the copy received by output-1 is set to be ‘1-bound’.

[0613] Else, the switching control is identical to that in a sorting cell associated with the partially ordered set {‘0-bound’, ‘1-bound’, ‘idle’, ‘bicast’} under the partial order of ‘0-bound’<‘idle’<‘1-bound’ and ‘0-bound’<‘bicast’<‘1-bound’.
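The control rule of Definition G6 can be sketched as a small decision function. The names `bicast_cell_state` and `output_symbols`, the single-letter labels '0', '1', 'I', 'B' for 0-bound, 1-bound, idle, and bicast, and the choice to treat every tie as bar when computing output symbols are assumptions made for illustration only; the arbitration of contention is not modelled.

```python
# Partial order '0' < 'I' < '1' and '0' < 'B' < '1'; 'I' and 'B' are
# incomparable, so they share a rank here and are special-cased below.
RANK = {"0": 0, "I": 1, "B": 1, "1": 2}

def bicast_cell_state(in0: str, in1: str) -> str:
    """Connection state for the signals at input-0 and input-1."""
    if (in0, in1) == ("B", "I"):
        return "bicast-0"          # input-0 is copied to both outputs
    if (in0, in1) == ("I", "B"):
        return "bicast-1"          # input-1 is copied to both outputs
    if RANK[in0] < RANK[in1]:
        return "bar"
    if RANK[in0] > RANK[in1]:
        return "cross"
    if in0 == "I":
        return "any"               # two idle expressions
    if in0 == "B":
        return "bar/cross"         # two bicast packets, one output each
    return f"contention for output-{in0}"

def output_symbols(in0: str, in1: str):
    """Control symbols seen at (output-0, output-1); ties treated as bar."""
    state = bicast_cell_state(in0, in1)
    if state.startswith("bicast"):
        return ("0", "1")          # copies are relabelled 0-bound / 1-bound
    return (in1, in0) if state == "cross" else (in0, in1)

print(bicast_cell_state("B", "0"), output_symbols("B", "0"))
```

Only the two bicast states rewrite a control signal; in every other case the symbols pass through unchanged, so a bicast packet that meets a unicast packet is still a bicast packet at the next stage.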

[0614] In the text or drawing where ‘0-bound’, ‘1-bound’, ‘idle’, ‘bicast’ are applicable, the symbols ‘0’, ‘1’, ‘I’ and ‘B’ respectively represent or symbolize 0-bound, 1-bound, idle and bicast packets, or control signals corresponding to 0-bound, 1-bound, idle and bicast.

[0615] FIG. 65D shows the scenario when the two input packets at input-0 (6560) and input-1 (6561) of a bicast cell (6551) are a bicast packet (6581) and an idle packet (6582), respectively. The connection state of the bicast cell is then set to be bicast-0 (6550). The bicast packet at input-0 is then bicasted through this connection state, that is, the control signals of the two copies of the bicast packet at output-0 (6570) and output-1 (6571) are respectively set to be ‘0-bound’ and ‘1-bound’. Similarly, FIG. 65E shows the scenario with an idle packet at input-0 and a bicast packet at input-1 of the bicast cell. The connection state is then bicast-1 (6551), and the control signals at output-0 and output-1 are again respectively set to be ‘0-bound’ and ‘1-bound’. Note that these are the only two cases in a bicast cell wherein the control signal of an input packet (necessarily a bicast packet) is changed when the packet is routed to the output. In other words, when a bicast packet arrives at a bicast cell, unless the packet at the other input is an idle packet, exactly one copy of the bicast packet will be routed to one of the outputs of the cell, and it remains a bicast packet.

[0616] Just as a routing cell is a switching cell under certain switching control related to sorting, a bicast cell is an expander cell under certain switching control related to sorting. If a generic expander cell is regarded as the multicast counterpart of a generic switching cell, then a bicast cell can be regarded as the multicast counterpart of a routing cell.

[0617] The routing control of a bicast cell is similar to that of a routing cell; thus the block diagram 6500 for a generic switching cell can be readily adapted for a generic expander cell, with the automata 6510 having more states to correspond to the additional bicast-0 and bicast-1 connection states.

[0618] H. SELF-ROUTING CONTROL OVER A MULTI-STAGE SWITCHING NETWORK

[0619] Recall from the previous section that centralized control for a switch is fast only when the number of I/O ports is small. Similarly, when a switching network is composed of a large number of switching nodes, centralized control over the network cannot be fast. Therefore in-band-controlled switching elements are often deployed inside a multi-stage network. An ideal style of distributed control over the network is to leave the switching decision to each individual switching element, which selects a connection configuration purely by the in-band control signals to that element and independently of all other concurrent input signals in the network, regardless of the scale of the network. Such control over the network appears as if the routing of each individual signal through the network is guided by the signal itself; the in-band control mechanism is sometimes referred to as “self-routing” in the literature.

[0620] The distributed nature of self-routing control thus enables fast switching control over large-scale switching devices constructed from massive interconnection networks of switching elements. Moreover, in broadband applications, the in-band control signal to a switching element needs to be contained in as few bits as possible so that the switching decision can be swiftly executed.

[0621] 1. Conventional self-routing over certain banyan-type networks

[0622] As alluded to in the Background Section, the concept of “self-routing” began with the in-band control mechanism for switching cells in the Omega network (defined earlier); this control mechanism is further elaborated upon now as a prelude to the description in accordance with the present invention.

[0623] Upon entering a 2^(n)×2^(n) Omega network (prepended with the shuffle exchange), a data packet composed of a sequence of bits is prepended with another sequence of bits which is its binary destination address d₁d₂ . . . d_(n).

[0624] The bit d_(j) indicates the preference between the two outputs of the stage-j cell. The leading bit d₁ is the in-band control signal of a data packet to the stage-1 switching cell. A switching cell at any stage takes the leading bit in each of its two input packets as the in-band control signal and selects its bar/cross connection state accordingly. In particular, a stage-1 switching cell takes the leading bit d₁ in a data packet as the in-band control signal and consumes the bit d₁ afterwards. Thus the leading bits in a data packet become d₂d₃ . . . d_(n) after exiting stage 1. A stage-2 switching cell takes the leading bit d₂ in a data packet as the in-band control signal and consumes the bit d₂ afterwards. Thus the leading bits in a data packet become d₃d₄ . . . d_(n) after exiting stage 2. And so on.

[0625] This self-routing mechanism has also been applied to the banyan network prepended with the shuffle exchange. As will be explained shortly, the mechanism rests on the fact that the guide of the particular banyan-type network is the monotonic sequence 1, 2, . . . , n. The same self-routing mechanism, however, does not apply to other banyan-type networks in general. Like the baseline network, both the Omega network and the banyan network are among those banyan-type networks well studied in the literature. It is ironic that these widely studied networks are all in anti-optimal topology in one sense or another with regard to the layout complexity under the 2-layer Manhattan model with reserved layers. It would be desirable to generalize the self-routing mechanisms to all banyan-type networks, including those in the optimal topology.

[0626] 2. Inventive self-routing by the guide of a bit-permuting network

[0627] In accordance with the present invention, for a generic 2^(n)×2^(n) banyan-type network with the guide γ(1), γ(2), . . . , γ(n), the self-routing mechanism can be generalized as follows. A packet destined for the output address binary(d₁d₂ . . . d_(n)) is prefixed with the binary control stream d_(γ(1))d_(γ(2)) . . . d_(γ(n)), or 1d_(γ(1))d_(γ(2)) . . . d_(γ(n)) if the activity bit is present; either d_(γ(1))d_(γ(2)) . . . d_(γ(n)) or 1d_(γ(1))d_(γ(2)) . . . d_(γ(n)), depending upon the context, is called the “routing tag”. In this context, the routing tag usually contains the activity bit. Thus the format of the whole packet entering the switching network, assuming the presence of the activity bit, is depicted by packet 6000 in FIG. 66A.

[0628] For each stage j, the in-band control signal used by the routing control at that stage is a two-bit sequence comprising the activity bit and d_(γ(j)), the j-th bit of the binary stream d_(γ(1))d_(γ(2)) . . . d_(γ(n)). Note that the in-band control signal changes from stage to stage but is conveniently derived from the initial routing tag.

[0629] It should be noted that, if the routing tag remained the same when entering each stage, the control circuitries at different stages would need different configurations in order to read different bit positions of the routing tag to extract the stage-specific control information, which is obviously undesirable. Therefore, a simple mechanism for manipulating the routing tag at each stage to facilitate the extraction of the right control information from the tag is described as follows: instead of being located at different positions from stage to stage, the two-bit in-band control signal should always be at a fixed position, say, the first two bits of the tag, such that the control circuitry at each stage can always read the leading two bits of the routing tag to make the routing decision. To achieve this, when a packet has reached the output port of a stage and before it enters the next stage, the second bit of the routing tag is shifted to the end of the tag, or just removed from the tag, by a simple dedicated 1×1 switching circuitry which is appended to every output port. In other words, each stage here actually performs the routing of the packet and the re-generation of the routing tag for the next stage. In this way, the first two bits are 1d_(γ(1)) when entering stage 1, and 1d_(γ(2)) when entering stage 2, and so on; that is, the leading two bits of the routing tag of the packet entering each stage j are always 1d_(γ(j)), the right control signal required by the control circuitry of that stage. As a consequence, the control circuitries can be identical at all stages.

[0630] When output contention occurs, one of the two packets intended for the same output may be deflected to the other output. However, in some applications, packet misrouting is more undesirable than blocking. In such cases, the switching cell simply blocks any intended 0-bound (resp. intended 1-bound) packet that has been deflected to output 1 (resp. output 0). This can usually be implemented inside the aforementioned 1×1 switching circuitry as well.

[0631] Note that such a 1×1 switching circuitry can either be physically implemented as a separate device appended to the main switching cell, as shown in FIG. 66C in the following example, or be a logical block in description but physically implemented as integrated into the circuitry of the main switching cell, as shown in FIG. 67A, which is a block diagram of a switching cell including bit consumption and rotation.

[0632] Assuming the second approach of removing the second bit is adopted, FIG. 66B summarizes the format of a generic routing tag (6601) of a data packet entering stage j, and FIG. 66C illustrates how the routing tag is changed at various locations in a generic stage j. When the routing tag 6610-1 has reached stage j, the segment d_(γ(1))d_(γ(2)) . . . d_(γ(j−1)) has been consumed in the previous j−1 stages so that only the bits 1d_(γ(j))d_(γ(j+1)) . . . d_(γ(n)) remain in the tag. The two leading bits (6611) are 1d_(γ(j)), and the switching control of the cell 6615 in stage j reads just these two bits as the in-band control signal. Two identical aforementioned 1×1 switching circuits 6616 are appended, one at each of the two output ports of the cell 6615. When the packet leaves the cell from one of its output ports, the routing tag 6610-2 is still 1d_(γ(j))d_(γ(j+1)) . . . d_(γ(n)). Then it enters the 1×1 switching circuitry 6616 attached at that output port, which removes the second bit of the routing tag, so the routing tag 6610-3 at the output of 6616 becomes 1d_(γ(j+1)) . . . d_(γ(n)).

EXAMPLE 1

[0633] To demonstrate this generalized self-routing mechanism, consider network 2900 of FIG. 29. The destination address binary(d₁d₂d₃d₄) for a packet is 1110. The guide has been computed earlier as the sequence 2, 4, 1, 3. Thus, d_(γ(1))=d₂=1, d_(γ(2))=d₄=0, d_(γ(3))=d₁=1, and d_(γ(4))=d₃=1, so the data packet is prepended with the binary stream 1d_(γ(1))d_(γ(2))d_(γ(3))d_(γ(4))=11011 as the routing tag. Each cell in the network is a sorting cell with respect to the linear order of

10(‘0-bound’)<00(‘idle’)<11(‘1-bound’).

[0634] Recall that such a routing cell always routes a 0-bound signal (with control bits 10) to output-0 and a 1-bound signal (with control bits 11) to output-1 when there is no output contention. Therefore, assuming no output contention occurs at any of the nodes along the path, upon entering the first stage at routing cell 2910, the two leading control bits, namely, 11, are used to set the connection state of the cell 2910 to “cross” in this case, since the signal enters the routing cell from its upper input, resulting in routing the packet to the lower output of the cell, that is, to the output address 1101 at that stage. Meanwhile the second bit of the in-band control signal, namely 1, is consumed by the appended 1×1 device (omitted in the drawing) and thus the new in-band control signal to the next stage becomes 10. Next, exchange X₍₃ ₄₎ leads the packet from the output address 1101 of stage 1 to the input address 1110 of stage 2. Then the new in-band control signal, namely 10, is used to set the stage-2 cell 2920 to the “bar” state, resulting in routing to output address 1110. Meanwhile the second bit of the in-band control signal, namely 0, is again consumed and thus the new in-band control signal to the next stage (stage 3) becomes 11. Next, exchange X₍₁ ₄₎ leads the packet from the output address 1110 of stage 2 to the input address 0111 of stage 3. Then the new 2-bit control sequence, namely 11, is used to set cell 2930 to the bar state, resulting in routing the packet to the output address 0111. Then the second bit of the in-band control signal, namely 1, is again consumed before entering stage 4. Finally, exchange X₍₂ ₄₎ leads the packet from the output address 0111 of stage 3 to the input address 0111 of stage 4. The remaining two control bits, namely 11, are used to set the cell 2940 to the bar state; the packet is then routed to the output address 0111, and finally led to its desired destination address 1110 through the output exchange X₍₄ ₃ ₂ ₁₎.
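The tag construction and per-stage consumption used in this example can be reproduced with a short Python sketch. The function names `routing_tag` and `stage_signals` are hypothetical, the guide is passed as a 1-indexed sequence, and the sketch assumes the variant in which each stage removes (rather than rotates) the second bit of the tag.

```python
from typing import List, Sequence

def routing_tag(dest: str, guide: Sequence[int], activity: bool = True) -> str:
    """Build d_g(1) d_g(2) ... from the destination bits d1 d2 ... dn,
    optionally prefixed with the activity bit '1'."""
    tag = "".join(dest[g - 1] for g in guide)   # guide entries are 1-indexed
    return "1" + tag if activity else tag

def stage_signals(tag: str) -> List[str]:
    """List the 2-bit in-band control signal read at each stage, removing
    the second bit of the tag after every stage."""
    signals = []
    while len(tag) >= 2:
        signals.append(tag[:2])
        tag = tag[0] + tag[2:]                  # consume the second bit
    return signals

tag = routing_tag("1110", [2, 4, 1, 3])
print(tag)                 # 11011
print(stage_signals(tag))  # ['11', '10', '11', '11']
```

The printed control signals match the values read at stages 1 through 4 in the walk-through: 1-bound, 0-bound, 1-bound, 1-bound. With the guide 1, 2, . . . , n the tag reduces to the destination address itself preceded by the activity bit.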

[0635] Note that when idle expressions are disallowed in the system, a routing mechanism similar to the one in the above example can be used without the activity bit in the routing tag. In that case, the in-band control signal to a generic stage-j cell is the single bit d_(γ(j)), which is also consumed by stage j.

[0636] The above self-routing mechanism can be extended to 2^(n)×2^(n) k-stage bit-permuting networks. Consider a generic 2^(n)×2^(n) k-stage bit-permuting network with the guide γ(1), γ(2), . . . , γ(k), where γ is a mapping from the set {1, 2, . . . , k} to the set {1, 2, . . . , n}. A packet destined for the binary output address d₁d₂ . . . d_(n) is initially prefixed with the routing tag 1d_(γ(1))d_(γ(2)) . . . d_(γ(k)). The in-band control signal to a stage-j switching cell is 1d_(γ(j)), and the second bit in this control signal is consumed at stage j. By induction on j, the in-band control signal is always in front of the packet upon entering any stage.

[0637] As already mentioned in the Background Section, and now well understood because of the foregoing description, the main reason behind the trial-and-error procedure of the prior art was that such techniques had not had the benefit of a fundamental theoretical approach of determining the routing tag d_(γ(1))d_(γ(2)) . . . d_(γ(n)) or 1d_(γ(1))d_(γ(2)) . . . d_(γ(n)) from the guide of a bit-permuting network. The routing tag used in the particular 2^(n)×2^(n) networks studied in the prior art is the destination address d₁d₂ . . . d_(n) of a packet, possibly with an activity bit up front. By happenstance, the general routing tag d_(γ(1))d_(γ(2)) . . . d_(γ(n)) coincides with the destination address d₁d₂ . . . d_(n) in the special case when the guide of a banyan-type network is the monotonically increasing sequence (i.e., the sequence 1, 2, . . . , n). As is now readily deduced, the destination address can be used as the routing tag only for those 2^(n)×2^(n) banyan-type networks with a monotonically increasing guide.

[0638] 3. Priority treatment

[0639] Let the guide of a 2^(n)×2^(n) banyan-type network be the sequence γ(1), γ(2), . . . , γ(n). Fill every node in the network with a routing cell adopting the coding scheme of

‘idle’=00; ‘0-bound’=10; ‘1-bound’=11

[0640] Thus the routing cell means a sorting cell with respect to the linear order of 10<00<11. By adopting the self-routing mechanism as introduced above, a packet with the binary destination address d₁d₂ . . . d_(n) is preceded by the bit pattern 1d_(γ(1))d_(γ(2)) . . . d_(γ(n)) upon entering the switching network. At stage j, 1≦j≦n, the in-band control signal consists of the two leading bits, and the stage consumes the bit d_(γ(j)). Thus the in-band control signal at stage j is 1d_(γ(j)) for a real data packet and is 00 for an idle expression.

[0641] Now suppose that there are 2^(r) priority classes of 0-bound or 1-bound packets. The priority class can be coded in an r-bit string p₁ . . . p_(r), and the coding for priority class may vary from one detailed design to another. To simplify the notation hereafter, r is assumed to be 2 and smaller code values represent higher priority classes. One way to blend the priority code p₁p₂ into the aforementioned self-routing scheme is as follows: Upon entering the switching network, a packet with the destination address d₁d₂ . . . d_(n) is preceded by the bit pattern 1d_(γ(1))p₁p₂d_(γ(2)) . . . d_(γ(n)) as illustrated by data packet 6650 of FIG. 66D. The generic routing cell in the network is now replaced by a sorting cell with respect to the linear order

1000<1001<1010<1011<0000<1111<1110<1101<1100

[0642] on the initial four bits of the packet. Moreover, the cell consumes the second bit and rotates the third and fourth bits to the position behind the fifth bit. Thus the initial four bits are 1d_(γ(j))p₁p₂ upon entering each stage j, 1≦j≦n. Thus, the sorting cell is essentially with respect to the linear order 10<00<11 on the two leading bits but uses the ensuing priority code p₁p₂ as the tiebreaker.
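The priority-aware order and the per-stage tag rewrite can be sketched as follows. This is an illustrative model only: `ORDER`, `sort_key`, and `rewrite_tag` are hypothetical names, r = 2 is assumed as in the text, and the second entry of the linear order is taken to be 1001 so that the nine listed values are distinct.

```python
# The nine 4-bit heads in ascending order; 1001 is assumed for the
# second entry (0-bound with priority code 01).
ORDER = ["1000", "1001", "1010", "1011", "0000",
         "1111", "1110", "1101", "1100"]

def sort_key(head: str):
    """Key for the first four bits 1 d p1 p2 (or 0000 for idle)."""
    if head[0] == "0":
        return (1, 0)              # idle sits between 0-bound and 1-bound
    p = int(head[2:4], 2)          # smaller p means higher priority
    if head[1] == "0":
        return (0, p)              # 0-bound: higher priority sorts lower
    return (2, -p)                 # 1-bound: higher priority sorts higher

def rewrite_tag(tag: str) -> str:
    """Consume the second bit and rotate p1 p2 behind the fifth bit:
    1 d p1 p2 d' rest  ->  1 d' p1 p2 rest."""
    return tag[0] + tag[4:5] + tag[2:4] + tag[5:]

print(sorted(ORDER, key=sort_key) == ORDER)       # True
print(sort_key("1101") < sort_key("1100"))        # True
print(rewrite_tag("1101101"))                     # 110101
```

The second print compares two 1-bound heads: the one with priority code 00 is the larger and therefore wins output-1, which is how the priority code acts as the tiebreaker on top of the order 10<00<11.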

[0643] The block diagram 6500 in FIG. 65A is adapted into the block diagram 6700 as shown in FIG. 67A for the inclusion of bit consumption and rotation. It assumes that γ(1)=1, γ(2)=2, γ(3)=3, etc. Three registers (6701, 6702, and 6703) represent the state of the automata (6710): As in FIG. 65A, there is the binary “connection state register” (6702) that indicates the prevailing bar/cross connection state and controls the two multiplexers (6711, 6712). There is also the binary “latch status register” (6703) that indicates whether the connection state is in the latched status or not. It is reset to UNLATCHED by the frame clock signal (6721). The “clock count register” (6701) stores the value CLOCK_COUNT, which advances along the bit clock from 0 to 5 and stays at 5 until the frame clock signal (6721) resets it to 0.

[0644] The illustrated scenario is when the packet 6751 starting with the bits 1d₁p₁p₂d₂ . . . (=11011 . . .) and packet 6752 starting with the bits 1d₁p₁p₂d₂ . . . (=11001 . . .) are ready to enter inputs 0 and 1, respectively. Then the frame clock signal (6721) arrives and resets the CLOCK_COUNT to 0 and the latch status register 6703 to UNLATCHED. The value of the connection state register 6702, which happens to be BAR in this case, remains unchanged.

[0645] At CLOCK_COUNT=1, the first bit of the packet 6751, namely, ‘1’, enters the first slot 6730-1 of the shift register (6730) connected to the input 0, and the first bit of the packet 6752, namely, ‘1’, enters the first slot 6731-1 of the shift register (6731) connected to the input 1, as shown in FIG. 67B. Since the automata cannot make a decision until the leading two bits from each of the packets have been read, nothing happens in the automata at this time.

[0646] At CLOCK_COUNT=2, the bit in the first slot of the shift register 6730 (resp. 6731) is shifted to the second slot 6730-2 (resp. 6731-2). The second bit of the packet 6751 (resp. packet 6752), namely, ‘1’ (resp. ‘1’), enters the first slot of shift register 6730 (resp. shift register 6731). The automata sorts the initial two bits according to the linear order of 10<00<11 with the bias toward input 0. Simply put, the 0/1 value of the second bit from input 0 determines the new BAR/CROSS state. In this case, the value of the connection state register is changed to CROSS but the latch status register remains UNLATCHED, as shown in FIG. 67C.

[0647] At CLOCK_COUNT=3, each bit is further shifted to the next slot, namely, the bits in slots 6730-1, 6731-1, 6730-2, and 6731-2 are respectively shifted to slots 6730-2, 6731-2, 6730-3, and 6731-3. The third bit of the packet 6751 (resp. packet 6752), which is the first priority bit, namely, ‘0’ (resp. ‘0’), enters the first slot of shift register 6730 (resp. shift register 6731). The automata starts using the priority code in tie breaking. It sorts the third input bit with respect to the linear order of 0<1 (resp. 1<0) when the connection state is bar (resp. cross). In this case, the connection state is cross, and the sorting result is again a tie. Thus the connection state register remains CROSS and the latch status register remains UNLATCHED, as shown in FIG. 67D. Meanwhile, the automata action readies the following path connections for the next clock tick:

[0648] The bit in the third slot of each of the shift registers, namely, slot 6730-3 and slot 6731-3, will not be shifted out.

[0649] The bit in the second slot of each of the shift registers, namely, slot 6730-2 and slot 6731-2, will be shifted out but will arrive nowhere. That is, the bit will be discarded.

[0650] At CLOCK_COUNT=4, the bits in the second slots (6730-2, 6731-2) are discarded. The bits in the first slots 6730-1 and 6731-1 are shifted to the second slots 6730-2 and 6731-2, respectively. The fourth bit of the packet 6751 (resp. packet 6752), which is the second priority bit, namely, ‘1’ (resp. ‘0’), enters the first slot of shift register 6730 (resp. shift register 6731). The automata uses this fourth input bit in another attempt at tie breaking. It sorts with respect to the linear order of 0<1 (resp. 1<0) when the connection state is bar (resp. cross). In this case, the connection state is cross before the sorting. The sorting result is decisive this time. It latches the connection state into bar, so the values of the connection state register and the latch status register become BAR and LATCHED, respectively, as shown in FIG. 67E. Meanwhile, the automata action readies the following path connections for the next clock tick:

[0651] The bit in the third slot of each of the shift registers, namely, slot 6730-3 and slot 6731-3, will be shifted out but will arrive nowhere. That is, the bit will be discarded.

[0652] The bits in the other slots of each shift register will not be shifted out.

[0653] The next bit from each input will go directly to the third slot of the shift register instead of the usual first slot.

[0654] At CLOCK_COUNT=5, the activity bit in each shift register reaches a multiplexer (6711 or 6712) through the prevailing connection state, which is bar in the present scenario, and exits from the sorting cell. All path connections in the shift registers are reset to the normal shifting, and the connection state remains latched in bar. This scenario is shown in FIG. 67F. The CLOCK_COUNT is now at its maximum value of 5 and will remain at 5 at subsequent bit clock signals. Thus the automata action will simply repeat. Eventually the next frame clock signal will reset the CLOCK_COUNT to 0.
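The decision sequence walked through above may be summarized by the following minimal sketch, which models only the connection-state logic and not the shift registers or clocking. The function name `settle_state`, the bit-string packet representation, and the assumption that a decisive comparison of the two leading bits latches the state immediately are illustrative and are not taken from FIGS. 67A-F.

```python
# Rank of the two leading bits under the linear order 10 < 00 < 11.
RANK = {"10": 0, "00": 1, "11": 2}


def settle_state(pkt0: str, pkt1: str, priority_bits: int = 2):
    """Return (connection_state, latched) for the packets at inputs 0 and 1.

    Each packet is a bit string laid out as 1 d1 p1 p2 ... (activity bit,
    address bit, then the priority code), as in the scenario above.
    """
    r0, r1 = RANK[pkt0[:2]], RANK[pkt1[:2]]
    if r0 != r1:
        # Assumption: a decisive two-bit comparison latches right away.
        return ("BAR" if r0 < r1 else "CROSS"), True

    # Tie with bias toward input 0: its second bit picks the tentative state.
    state = "CROSS" if pkt0[1] == "1" else "BAR"

    for i in range(2, 2 + priority_bits):
        b0, b1 = pkt0[i], pkt1[i]
        if b0 == b1:
            continue  # still a tie: keep the state and stay unlatched
        # The order is 0<1 under bar and 1<0 under cross; the smaller
        # signal is routed to output 0.
        in0_smaller = (b0 < b1) if state == "BAR" else (b0 > b1)
        return ("BAR" if in0_smaller else "CROSS"), True

    return state, False


# Scenario of packets 6751 and 6752: tie on 11, tie on the first
# priority bit, decisive on the second priority bit.
print(settle_state("11011", "11001"))
```

For the two packets of the illustrated scenario the sketch ends in the bar state with the latch set, matching the values described for FIG. 67E.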

[0655] Remarks. Besides the switching function, the above-described sorting cell performs the consumption of an address bit and the backward rotation of the priority code. It is quite common for a routing cell in a particular application to perform ad hoc operations that modify packets. Below are some examples of such operations.

[0656] (1) Upon entering an n-stage routing network a packet is initially prefixed by the in-band control signal 1g₁g₂ . . . g_(n). The stage-1 cell has to remove bit g₁ from the prefix so that the two leading bits in the control signal entering stage 2 will be 1g₂ instead of 1g₁. Suppose that the complete input packet, including the in-band control signal, must emerge intact upon exiting the routing network. In that case, the bit g₁ has to be preserved somehow. The simplest way is for the stage-1 cell to rotate the in-band control signal 1g₁g₂ . . . g_(n) into 1g₂ . . . g_(n)g₁. Similarly, the stage-j cell, 1≦j≦n, rotates the in-band control signal 1g_(j)g_(j+1) . . . g_(n)g₁ . . . g_(j−1) into 1g_(j+1) . . . g_(n)g₁ . . . g_(j−1)g_(j). This bit rotation requires the buffering of Ω(n) bits by shift registers inside the routing cell. The natural implementation is the same as for the backward rotation of the priority code described above.

[0657] (2) Another common modification pertains to the switching function when it detects output contention at the sorting cell. Consider the scenario when two 0-bound packets arrive at a cell simultaneously. Only one of them may be routed to output 0; the other has to be deflected to output 1 through the bar/cross state. Typically, once a packet is misrouted at some stage, it does not matter whether it is correctly routed at subsequent stages. The control signals in front of deflected packets can then be deliberately altered to yield priority to others. One possibility is to change the control signal into the new value 01 and use it throughout the remaining stages. Such bit alteration can be easily implemented with shift registers similar to those in FIG. 67A. Concomitantly the underlying linear order 10<00<11 among values of control signals needs to be extended to the partial order 10<0x<11. That is, every cell after stage 1 needs to be a sorting cell with respect to this partial order.

[0658] (3) In some applications, packet misrouting is more undesirable than blocking. In such a case, the switching cell simply blocks the deflected packet upon output contention, effectively turning the packet into a string of 0s. The implementation is trivial.
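The bit rotation of operation (1) above can be illustrated by a short sketch. The function name `rotate_control_signal` and the bit-string representation of the prefix are hypothetical; the sketch shows only the effect on the prefix, not the shift-register implementation.

```python
def rotate_control_signal(tag: str) -> str:
    """Rotate 1 g_j g_(j+1) ... into 1 g_(j+1) ... g_j.

    The activity bit stays in front; the bit just used by the current
    stage is moved to the end of the in-band control signal.
    """
    if len(tag) < 2 or tag[0] != "1":
        raise ValueError("expected an activity bit followed by address bits")
    return tag[0] + tag[2:] + tag[1]


tag = "10110"          # 1 g1 g2 g3 g4 with g = 0, 1, 1, 0
for stage in range(1, 5):
    print(f"stage {stage} sees control signal {tag[:2]}")
    tag = rotate_control_signal(tag)
print("prefix on exit:", tag)
```

After n rotations the prefix returns to its initial value, which is why the complete input packet emerges intact from an n-stage network.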

[0659] 4. Multi-stage interconnection network of sorting cells

[0660] Definition H1: “routing network”. A “routing network associated with a partially ordered set” is a multi-stage network composed of sorting cells associated with the said partially ordered set and possibly 1×1 switches, where the in-band control signal of a packet may change from stage to stage. This is simply called a “routing network” when the partially ordered set is understood or not of concern in the context.

EXAMPLE 2

[0661] A banyan-type network employing the self-routing mechanism as elucidated in Example 1 above is a routing network. This routing network is composed of routing cells associated with the set {00, 10, 11} under the linear order of 10<00<11, plus 1×1 switches at each stage for changing the in-band control signal. The above linear order is due to the presence of the activity bit. When the activity bit is not present, the routing network can be constructed similarly but with routing cells replaced by 0-1 sorting cells associated with the set {0, 1} under the linear order of 0<1. In either case, the in-band control signals are changed from stage to stage, as described in Example 1.

[0662] Definition H2: “partial sorting network”. A “partial sorting network associated with a partially ordered set” is a multi-stage network composed of sorting cells associated with the partially ordered set and possibly 1×1 switches, where the in-band control signal at the beginning of a packet is preserved through every stage for reuse at the next stage. When the partial order is understood or not of concern in the context, it is simply called a “partial sorting network”.

[0663] The term “partial sorting” suggests that the network does not necessarily completely sort all input signals into a linear order. Commonly seen examples of sorting cells inside a partial sorting network are the 0-1 sorting cell and the routing cell.

[0664] Note that the routing control over a partial sorting network naturally qualifies as a form of self-routing. The switching decision at a cell in the network is determined simply by the comparison between the in-band control signals carried by the two input packets to the cell. The whole packet, including the in-band control signal, is preserved through every stage.

EXAMPLE 3

[0665] Consider the 4×4 network 6800 as shown in FIG. 68. Let the control signals be 3-bit. Fill each of the cells (6801) in the network with a sorting cell with respect to the natural order among 3-bit numbers. The network then qualifies as a partial sorting network. The 1×1 delay elements (6802) in the network serve only to maintain packet synchronization across stages.

[0666] 5. Concentrators and the method of statistical line grouping over a banyan-type network

[0667] Self-routing over a banyan-type network is of interest because of the simple distributed control. However, all banyan-type networks are blocking. One way to adapt banyan-type networks into switch designs is to choose a network with the monotonically increasing (or decreasing) trace and guide and utilize the conditionally nonblocking properties of its switch realizations. In order to invoke such a “conditionally” nonblocking property, the “condition” must first be met. For instance, the condition for the decompressor property is the existence of a rotation on the input addresses such that after the rotation, the active input addresses are consecutive, and the correspondence between the active I/O addresses is order-preserving. With the proper preprocessing and buffering at the inputs, the self-routing mechanism described above becomes nonblocking for the point-to-point switching over a decompressor constructed from a banyan-type network.

[0668] Another way to adapt banyan-type networks to switch designs is by statistical line grouping. Statistical line grouping creates a “multi-lined version” of any type of structure that involves interconnection lines among its internal elements. This technique replaces an interconnection line between two nodes with a bundle of lines. Concomitantly, the number of I/O of every node expands proportionally, i.e., the node is proportionally dilated. The underlying statistical principle is the “large-group effect” in diluting the blocking probability. This method is very practical since it does not require preprocessing and buffering of the input traffic.

[0669] When the method of statistical line grouping is applied to a 2^(n)×2^(n) banyan-type network, it replaces every interconnection line by a bundle of, say, b lines and also dilates every 2×2 cell into a 2b×2b node. The resulting b2^(n)×b2^(n) network is called the b-line version of the 2^(n)×2^(n) network. The following example shows an 8-line version of the 16×16 divide-and-conquer network.

EXAMPLE 4

[0670] With reference to FIG. 69, application of statistical line grouping with the line-bundle size 8 to the 16×16 divide-and-conquer network results in a 128×128 network (6900) comprising 16×16 nodes (e.g., 6901). Instead of having two input ports and two output ports, each cell is dilated into a node (6901) with two groups (6902, 6903) of input ports and two groups (6904, 6905) of output ports. The two output groups are called the 0-output group (6904) and the 1-output group (6905). Similarly, the two input groups are called the 0-input group (6902) and the 1-input group (6903). The output groups of all nodes at a stage are connected to the input groups of nodes at the next stage.

[0671] The key issue in the method of statistical line grouping lies in the choice of the 2b×2b switch for filling the dilated node. In principle a 2b×2b switching fabric of any style, such as a crossbar or a shared-buffer-memory switch, can fill the dilated node provided the complexity is satisfactorily low in both the switching control and the switching elements. The following criteria are usually considered when choosing the switch to fill the dilated node:

[0672] Ideally the switching control of the 2b×2b switch needs to be compatible with self-routing over banyan-type networks.

[0673] Moreover, the switch does not have to be nonblocking but needs to possess some “partial property” of being nonblocking that is articulated in the sequel.

[0674] Definition H3: “m-to-n concentrator”. For n<m, an m-to-n concentrator is an m×m switch having a “0-output group” comprising the m−n outputs with the smallest addresses, that is, from 0 to m−n−1, and a “1-output group” comprising the remaining n outputs such that when the given input signals to the concentrator are subject to a partial order, then any signal routed to the 0-output group is never greater than any signal routed to the 1-output group under the said order. Thus, an m-to-n concentrator can be regarded as a device which is capable of partitioning the m input signals (including real data input signals and artificial idle expressions) into two groups: the group of n largest signals, which are routed to the 1-output group, and the group of m−n smallest signals, which are routed to the 0-output group. As per the graph representation, by default the m-to-n concentrator is the one wherein the upper m−n output ports form the 0-output group and the lower n output ports form the 1-output group.

[0675] In some references in the background art, there is the notion of an “m×n concentrator”, which means an m×n switch, n<m, such that the largest n input signals are routed to the n output ports. Thus an m-to-n concentrator defined above can be reduced to an “m×n concentrator” by not implementing the output ports in the 0-output group. In order to avoid terminology ambiguity, the notion of an “m×n concentrator” will not be adopted. Every concentrator in this context refers to an m-to-n concentrator for some m and some n, n<m.

EXAMPLE 5

[0676] FIG. 70A shows an 8-to-4 concentrator 7000 constructed by an 8×8 partial sorting network which is a 4-stage interconnection network of sorting cells. The control signals are 3-bit. All sorting cells (7001, 7002) are associated with the natural order among 3-bit numbers except that the two outputs of each of the sorting cells 7002 are inversely positioned. As shown in the figure, the arrow on a sorting cell always points to output-1, which receives the signal with the larger value between the two. The figure demonstrates a test run over this concentrator. The eight output signals are partitioned into two groups (7020, 7021), with the group of smallest four signals (7020), namely, 000, 011, 101, and 100, at the 0-output group (7010) of the concentrator, and the group of largest four signals (7021), namely, 111, 110, 110, and 110, at the 1-output group (7011). Note that the order among signals within each group is arbitrary.

EXAMPLE 6

[0677] FIG. 70B shows a test run of 2-bit signals through another 8-to-4 concentrator 7050 which shares the same underlying 8×8 partial sorting network employed by the concentrator 7000 in Example 5. This time the sorting cells (7051, 7052) in the network are routing cells, i.e., sorting cells associated with the linear order of 10<00<11. Again, the two outputs of each of the sorting cells 7052 are inversely positioned. The eight output signals are partitioned into two groups (7070, 7071), with the group of smallest four signals (7070), namely, 00, 10, 00, and 10, at the 0-output group (7060) of the concentrator, and the group of largest four signals (7071), namely, 11, 11, 00, and 11, at the 1-output group (7061).
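The partition property of Definition H3 can be checked against the output groups reported in Examples 5 and 6. The following is a minimal functional sketch; `concentrate` is a reference model that sorts and splits, not a model of the 4-stage network of sorting cells, and all names in it are illustrative.

```python
def natural_rank(signal: str) -> int:
    """Natural order among binary numbers (Example 5)."""
    return int(signal, 2)


def routing_rank(signal: str) -> int:
    """Linear order 10 < 00 < 11 of routing cells (Example 6)."""
    return {"10": 0, "00": 1, "11": 2}[signal]


def is_concentrated(group0, group1, rank) -> bool:
    """True if no signal in the 0-output group exceeds any signal
    in the 1-output group under the given order."""
    return max(map(rank, group0)) <= min(map(rank, group1))


def concentrate(signals, n, rank):
    """Reference model of an m-to-n concentrator: the m-n smallest
    signals form the 0-output group, the n largest the 1-output group."""
    ordered = sorted(signals, key=rank)
    return ordered[: len(signals) - n], ordered[len(signals) - n:]


# Output groups of the test run in Example 5 (FIG. 70A).
print(is_concentrated(["000", "011", "101", "100"],
                      ["111", "110", "110", "110"], natural_rank))

# Output groups of the test run in Example 6 (FIG. 70B); the idle
# signal 00 may appear in either group because of ties.
print(is_concentrated(["00", "10", "00", "10"],
                      ["11", "11", "00", "11"], routing_rank))
```

Both checks succeed. In the second run the idle value 00 appears in both groups, which is consistent with the definition because equal signals are never greater than one another.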

[0678] Remark. Sorting cells associated with different partially ordered sets incur different complexities in their physical implementation. For example, the implementation of a sorting cell supporting priority treatment, as shown in FIGS. 67A-F, is much more complex than one which does not. The concentrator 7000 in Example 5 and the concentrator 7050 in Example 6 share the same network structure, but the sorting cells in them are associated with two different partially ordered sets and hence the two concentrators are physically different.

[0679] One of the criteria mentioned above in choosing the proper switch to fill the dilated node in a b-line version of a banyan-type network is a “partial property” of being nonblocking. Explicitly this partial property means the guarantee to route the maximum possible number of 0-bound signals to the 0-output group and the maximum possible number of 1-bound signals to the 1-output group. For a 2b-to-b concentrator composed of interconnected routing cells (plus possibly 1×1 elements), the nature of a concentrator in routing the smallest m−n signals to the 0-output group and the largest n signals to the 1-output group is precisely equivalent to this guarantee. Therefore, a 2b-to-b concentrator composed of interconnected routing cells meets this criterion perfectly for filling the dilated node in a b-line version of a banyan-type network.

[0680] The other criterion in choosing the proper switch to fill the dilated node in a b-line version of a banyan-type network is the compatibility with self-routing over the banyan-type network. The 2b-to-b concentrator composed of interconnected routing cells again meets the criterion perfectly. As a switch constructed by a partial sorting network, a concentrator possesses a natural self-routing mechanism. When the 2b-to-b concentrator fills every dilated node of the b-line version of the banyan-type network, the whole network becomes a large multi-stage interconnection network of routing cells. The marriage of the self-routing mechanism over the partial sorting networks with the self-routing mechanism over the banyan-type network, as to be detailed in the next sub-section, creates a self-routing mechanism over the said large multi-stage interconnection network of sorting cells.

[0681] Remark. As before, if idle expressions are disallowed in the system, the 2b-to-b concentrator composed of interconnected routing cells can be substituted by a 2b-to-b concentrator composed of interconnected 0-1 sorting cells. The same applies throughout the next sub-section.

[0682] 6. Self-routing over a multi-stage interconnection network of concentrators

[0683] Hereafter unless otherwise specified, all concentrators refer to those constructed by partial sorting networks.

[0684] Recall the classification of multi-stage networks of sorting cells into routing networks and partial sorting networks. The in-band control signal of a packet is preserved through a partial sorting network. On the other hand, it changes from stage to stage when the packet traverses a routing network, e.g., a banyan-type network under basic self-routing control. The b-line version of a 2^(n)×2^(n) banyan-type network is a hybrid between a routing network and a partial sorting network when every dilated node in it is filled with a 2b-to-b concentrator composed of interconnected routing cells. The hybrid network may be viewed as composed of n “super-stages” of concentrators. At each super-stage, a packet traverses a partial sorting network, which is by itself a multi-stage network of routing cells, and the in-band control signal of a packet changes only between super-stages.

[0685] The b2^(n) outputs of the hybrid network are in 2^(n) groups of the size b. The destination of a packet is an output group rather than an individual output in an output group. In accordance with the present invention, upon entering a generic 2^(n)×2^(n) banyan-type network with the guide γ(1), γ(2), . . . , γ(n), a packet destined for the output at the address d₁d₂ . . . d_(n) is preceded by the routing tag 1d_(γ(1))d_(γ(2)) . . . d_(γ(n)) and the in-band control signal to the stage-j switching cell is 1d_(γ(j)). The same routing tag still applies in the b-line version of the banyan-type network in which every dilated node is filled by a 2b-to-b concentrator when the packet is destined for the output group at the address d₁d₂ . . . d_(n), and, for 1≦j≦n, the in-band control signal to a concentrator in the j^(th) super-stage is 1d_(γ(j)). More explicitly, the in-band control signal to every routing cell in a concentrator at the j^(th) super-stage is 1d_(γ(j)). As the packet progresses through the hybrid network composed of many stages of routing cells, the in-band control signal to a routing cell changes only upon the exit from a concentrator. That is, the bit d_(γ(j)) is consumed not by any generic routing cell inside a concentrator at the j^(th) super-stage but rather by certain extra circuitry installed at the output end of the concentrator. This extra circuitry handles each packet separately and hence consists of 2b parallel 1×1 switching elements. There may exist other 1×1 elements in the 2b-to-b concentrator, e.g., delay elements for maintaining the synchronization across the stage and annihilators of misrouted packets.

EXAMPLE 7

[0686] The guide of the 16×16 divide-and-conquer network is the sequence 1, 2, 3, 4. The network 6900 shown in FIG. 69 is the 8-line version of the 16×16 divide-and-conquer network. This is a 128×128 network, and each of the dilated nodes is 16×16. Thus every dilated node (e.g., 6901) is filled with a 16-to-8 concentrator consisting of multi-stage interconnected routing cells plus 1×1 elements. The 128 outputs of this network are partitioned into 16 output groups of the size 8. Each output group is associated with a 4-bit address. A packet is destined for an output group rather than a specific output in the group. That is, the routing of a signal to any port within a group is just as good as routing to any other port in the group. When the destined output group is at the address d₁d₂d₃d₄, the initial routing tag of the packet is 1d_(γ(1))d_(γ(2))d_(γ(3))d_(γ(4))=1d₁d₂d₃d₄. The in-band control signal of the packet to every routing cell in the concentrator at the 1^(st) super-stage is 1d₁. Upon exiting that concentrator, the bit d₁ in the routing tag is consumed by a 1×1 element in the concentrator. Thus the routing tag upon entering the 2^(nd) super-stage is 1d₂d₃d₄. And so on.

[0687] A practical switch must cope with output contention, traffic fluctuation, burstiness, and so forth, and some alternate-routing ingredients, explicitly or implicitly, help resolve these problems. The key is not to complicate the switching control too much through alternate routing. From the macro perspective, the above-described hybrid network inherits the unique-routing characteristic from the banyan-type network and thereby allows very simple control. The micro view, on the other hand, reveals the alternate-routing nature concealed inside individual concentrators. The good news is the natural marriage between the self-routing control of concentrators and the self-routing control over the banyan-type network into an extremely simple self-routing control over the hybrid network.

[0688] Recall that the self-routing control mechanism over 2^(n)×2^(n) banyan-type networks can be extended to 2^(n)×2^(n) k-stage bit-permuting networks. Therefore, when the underlying banyan-type network of the above hybrid network is replaced by a bit-permuting network, the overall self-routing control over the resulting hybrid network is extremely similar to the above, that is, it is simply the marriage between the self-routing control of concentrators and the self-routing control over the replacing bit-permuting network. More precisely, when the replacing bit-permuting network is a 2^(n)×2^(n) k-stage bit-permuting network with the guide γ(1), γ(2), . . . , γ(k), where γ is a mapping from the set {1, 2, . . . , k} to the set {1, 2, . . . , n}, a packet destined for the binary output group address d₁d₂ . . . d_(n) is initially prefixed with the routing tag 1d_(γ(1))d_(γ(2)) . . . d_(γ(k)). For 1≦j≦k, the in-band control signal to a concentrator in the j^(th) super-stage is 1d_(γ(j)), and the second bit in this control signal is consumed upon the exit from the concentrator.
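The construction of the routing tag from the guide, and its consumption one bit per super-stage, can be sketched as follows. The function names and the bit-string representation are hypothetical; the guide is given as a list of 1-based indices γ(1), . . . , γ(k).

```python
def routing_tag(dest: str, guide) -> str:
    """Build 1 d_γ(1) d_γ(2) ... d_γ(k) from the output group address
    dest = d1 d2 ... dn and the guide [γ(1), ..., γ(k)] (1-based)."""
    return "1" + "".join(dest[g - 1] for g in guide)


def control_signals(tag: str):
    """List the in-band control signal 1 d_γ(j) seen at each super-stage;
    the second bit is consumed upon exit from each concentrator."""
    seen = []
    while len(tag) > 1:
        seen.append(tag[:2])
        tag = tag[0] + tag[2:]   # consume the bit just used
    return seen


# Example 7: guide 1, 2, 3, 4 and, for illustration, group address 1011.
tag = routing_tag("1011", [1, 2, 3, 4])
print(tag, control_signals(tag))

# A 3-stage bit-permuting network on 2-bit addresses (illustrative guide).
print(routing_tag("01", [2, 1, 2]))
```

The address 1011 and the guide 2, 1, 2 are arbitrary illustrative values; only the guide 1, 2, 3, 4 is taken from Example 7.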

[0689] 7. Multicast concentrators

[0690] A concentrator composed of interconnected routing cells is a point-to-point switch that routes 0-bound, 1-bound, and idle packets to 0- and 1-output groups; it satisfies the desirable characteristic of always routing the maximum possible number of 0-bound (resp. 1-bound) signals to its 0-output group (resp. 1-output group). For a multicast switch that routes 0-bound, 1-bound, idle, and bicast packets to 0- and 1-output groups, a corresponding desirable characteristic is to route the maximum total number of 0-bound and bicast signals to the 0-output group and the maximum total number of 1-bound and bicast signals to the 1-output group. This concept is formulated in the next definition.

[0691] Definition H4: “m-to-n multicast concentrator”. For n<m, an m×m switch having a “0-output group” comprising the m−n outputs with the smallest addresses, that is, from 0 to m−n−1, and a “1-output group” comprising the remaining n outputs and receiving 0-bound, 1-bound, idle and bicast input signals is called an m-to-n “multicast concentrator” if it routes the maximum total number of 0-bound and bicast signals to the 0-output group and the maximum total number of 1-bound and bicast signals to the 1-output group.

[0692] An m-to-n multicast concentrator, by its definition, always guarantees that the total number of 0-bound (resp. 1-bound) and bicast signals routed to its 0-output group (resp. 1-output group) is the maximum possible. This guarantee can be equivalently expressed as follows: letting the numbers of 0-bound, 1-bound, bicast, and idle signals that arrive at an m-to-n multicast concentrator be x₀, x₁, x_(b), and m−x₀−x₁−x_(b), respectively, the total number of 0-bound and bicast signals that arrive at the 0-output group of the multicast concentrator is min{m−n, x₀+x_(b)}, and the total number of 1-bound and bicast signals that arrive at the 1-output group is min{n, x₁+x_(b)}. A multicast concentrator is a switch serving the combined objective of concentration and multicasting. In the absence of bicast signals, its function reduces to that of a concentrator.

[0693] In accordance with the present invention, an m-to-n multicast concentrator can be constructed from an m-to-n concentrator as follows: an m-to-n concentrator constructed from a partial sorting network of interconnected routing cells can be adapted into an m-to-n multicast concentrator by replacing each of the routing cells with a bicast cell as defined in Definition G6.

EXAMPLE 8

[0694] The 8-to-4 concentrator 7000 depicted in FIG. 70A can be adapted into an 8-to-4 multicast concentrator 7100 depicted in FIG. 71A as follows. The underlying interconnection network is unchanged, but a bicast cell replaces every sorting cell in the concentrator. As before, the arrow on a bicast cell always points to output-1. In the test run of routing packets through this multicast concentrator as illustrated in FIG. 71A, the eight input packets a, b, c, d, e, f, g, and h are respectively idle, 0-bound, bicast, 0-bound, bicast, bicast, 1-bound, and 1-bound and respectively represented as ‘a(I)’, ‘b(0)’, ‘c(B)’, ‘d(0)’, ‘e(B)’, ‘f(B)’, ‘g(1)’, and ‘h(1)’. Among the three bicast packets, only packet c(B) is bicasted, that is, it successfully converts itself into a 0-bound copy and a 1-bound copy, and this conversion occurs at the bicast cell 7102-1 when ‘c(B)’ meets the idle packet ‘a(I)’ and thereby produces ‘c(0)’ and ‘c(1)’. The other two bicast packets ‘e(B)’ and ‘f(B)’ remain bicast packets throughout the multicast concentrator.

[0695] FIG. 71B shows another test run, with the same input packets as before except for idle packets d and g in this run. This time two of the bicast packets, c(B) and e(B), are bicasted into 0-bound and 1-bound copies at the bicast cells 7101-1 and 7102-2. The third bicast packet f(B) remains a bicast packet throughout the multicast concentrator despite the presence of three idle packets at the beginning. Recall that an m-to-n multicast concentrator only guarantees that the total number of 0-bound and bicast packets routed to the 0-output group is min{m−n, x₀+x_(b)} and the total number of 1-bound and bicast packets routed to the 1-output group is min{n, x₁+x_(b)}. In this case, m=8, n=4, x₀=2, x₁=0, x_(b)=3 and min{m−n, x₀+x_(b)}=min{8−4, 2+3}=4. The total number of 0-bound and bicast packets routed to the 0-output group is indeed equal to min{m−n, x₀+x_(b)}, as verified by the four packets at the 0-output group 7110, namely, the two 0-bound packets b(0) and h(0), and the two 0-bound copies c(0) and e(0) of the two bicast packets c and e, respectively. Similarly, the total number of 1-bound and bicast packets routed to the 1-output group is min{4, 0+3}=3, as verified by the bicast packet f(B) and the two 1-bound copies, c(1) from c and e(1) from e, at the 1-output group 7171.
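The guarantee of an m-to-n multicast concentrator is a pair of simple counts, which the following sketch evaluates for the figures quoted above. The function name `guaranteed_counts` is illustrative; the sketch computes only the guaranteed totals and does not model the network of bicast cells.

```python
def guaranteed_counts(m: int, n: int, x0: int, x1: int, xb: int):
    """Return (min{m-n, x0+xb}, min{n, x1+xb}): the guaranteed totals of
    0-bound-plus-bicast signals at the 0-output group and of
    1-bound-plus-bicast signals at the 1-output group."""
    if not 0 < n < m:
        raise ValueError("an m-to-n concentrator requires 0 < n < m")
    if min(x0, x1, xb) < 0 or x0 + x1 + xb > m:
        raise ValueError("signal counts must be non-negative and total at most m")
    return min(m - n, x0 + xb), min(n, x1 + xb)


# Values stated for the test run of FIG. 71B.
print(guaranteed_counts(8, 4, x0=2, x1=0, xb=3))

# Inputs listed in Example 8: two 0-bound, two 1-bound, three bicast, one idle.
print(guaranteed_counts(8, 4, x0=2, x1=2, xb=3))
```

With x_(b)=0 the two totals reduce to min{m−n, x₀} and min{n, x₁}, i.e., to the guarantee of an ordinary concentrator.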

[0696] Priority classification of 0-bound and 1-bound signals can be easily blended into the in-band control of the bicast cell as a tiebreaker upon output contention. Suppose the ‘0-bound’ value of a signal is replaced with the values ‘hi 0-bound’, . . . , ‘lo 0-bound’, and the ‘1-bound’ value with the values ‘hi 1-bound’, . . . , ‘lo 1-bound’. (Here “hi” and “lo” are shorthand for the highest and lowest priorities.) Then the in-band control of a bicast cell can be modified into:

[0697] (1) When the input signals to the bicast cell are a bicast signal and an idle signal, then output-0 (resp. output-1) produces a lo 0-bound (resp. lo 1-bound) signal.

[0698] (2) Otherwise, the bicast cells perform sorting with respect to the partial order:

‘hi 0-bound’ < . . . < ‘lo 0-bound’ < ‘idle’ < ‘lo 1-bound’ < . . . < ‘hi 1-bound’ and

‘hi 0-bound’ < . . . < ‘lo 0-bound’ < ‘bicast’ < ‘lo 1-bound’ < . . . < ‘hi 1-bound’.

[0699] Such a modified multicast concentrator then guarantees that the total number of 0-bound (resp. 1-bound) and bicast signals at the 0-output group (resp. 1-output group) is the maximum possible according to the priority class. This guarantee does not hold, however, if rule (1) were allowed to generate packets not of the lowest priority.

EXAMPLE 9

[0700] FIG. 72A illustrates the operation of the multicast concentrator 7200 with priority treatment. In this example, the 0-bound and 1-bound packets are simply divided into two priority classes, the normal 0- and 1-bound packets and the priority 0- and 1-bound packets, indicated by a superscript ‘+’, e.g., the packet ‘a(1⁺)’. Suppose the aforementioned rule (1) were to generate packets not of the lowest priority and, in this particular example, were to generate priority 0- and 1-bound packets out of a non-priority bicast packet. This case is illustrated in FIG. 72B, where the bicast packet ‘d(B)’ is bicasted into a normal 0-bound packet ‘d(0)’ and a priority 1-bound packet ‘d(1)’ at the bicast cell 7251, and the bicast packet ‘g(B)’ is bicasted into a normal 0-bound packet ‘g(0)’ and a priority 1-bound packet ‘g(1)’ at the bicast cell 7252. Then a normal 1-bound packet, in this case the packet ‘h(1)’ (7232), would reach the 1-output group (7221) while a priority 1-bound packet, in this case the packet ‘a(1⁺)’ (7231), would reach the 0-output group (7220).

[0701] 8. Self-routing multicasting over a banyan-type network

[0702] A 2^(n)×2^(n) multicast switch allows a packet to be destined for an arbitrary subset of the 2^(n) output addresses. The overhead in encoding an arbitrary set of destination addresses is costly. In fact, the number of bits cannot be reduced to less than 2^(n). However, this excessive overhead can be drastically trimmed when certain practically reasonable constraints are imposed on the set of the destinations of a packet. One constraint is that the set of destination addresses of every packet should be a “rectangle”, as defined in the sequel.

[0703] Definition H5: “rectangle”. Regard the entirety of 2^(n) output addresses as the n-dimensional binary cube {0, 1}×{0, 1}× . . . ×{0, 1}. A subset in the form of S₁×S₂× . . . ×S_(n), where each S_(j) is a nonempty subset of {0, 1}, will be called a “rectangular set of output addresses”, or simply a “rectangle”. If a rectangle contains 2^(k) output addresses, it is called a “k-dimensional rectangle”.

EXAMPLE 10

[0704] A generic binary address of a 2⁶×2⁶ banyan-type network is b₁b₂b₃b₄b₅b₆. The entirety of 2⁶ output addresses is a 6-dimensional binary cube S₁×S₂× . . . ×S₆, where each S_(j)={0, 1} corresponds to the two possible values of b_(j). One of the rectangles of this 6-dimensional binary cube can be the subset in the form of {0, 1}×{0}×{0, 1}×{1}×{0, 1}×{1}, which contains 2³ output addresses, namely, 000101, 000111, 001101, 001111, 100101, 100111, 101101, and 101111, so this is a 3-dimensional rectangle. The number of 3-dimensional rectangles in the 6-dimensional binary cube is 2⁶⁻³·₆C₃ = 8·(6·5·4)/(3·2·1) = 160.
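The enumeration and the count in Example 10 can be reproduced with a few lines. In this sketch each S_(j) is written as a string of its elements (e.g., "01" for {0, 1}); the function names are illustrative. The count uses the fact that a k-dimensional rectangle is fixed by choosing which k of the n coordinates are free and assigning a bit to each of the remaining n−k coordinates.

```python
from itertools import product
from math import comb


def rectangle_addresses(sets):
    """Enumerate the addresses of the rectangle S1 x S2 x ... x Sn,
    where each Sj is given as "0", "1", or "01"."""
    return ["".join(bits) for bits in product(*(sorted(s) for s in sets))]


def count_rectangles(n: int, k: int) -> int:
    """Number of k-dimensional rectangles in the n-dimensional binary
    cube: choose the k free coordinates, then fix the other n-k bits."""
    return 2 ** (n - k) * comb(n, k)


addresses = rectangle_addresses(["01", "0", "01", "1", "01", "1"])
print(len(addresses), addresses)
print(count_rectangles(6, 3))
```

The rectangle has 8 = 2³ addresses, so its dimension is 3, and the count for n=6, k=3 is 160.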

[0705] The aforementioned constraint requires the set of destination addresses of every packet to be a rectangle. For a practical application under this restriction, output addresses of the switch must be tactically assigned so that a packet's multicast destinations are usually covered tightly by just a rectangle or two. For example, on a broadband switch for heterogeneous applications, a rectangle of output addresses may be assigned to cable TV subscribers.

[0706] An inventive self-routing mechanism for multicast switching over any 2^(n)×2^(n) banyan-type network based on such a constraint is disclosed as follows. Consider a generic quaternary symbol with the four values ‘0-bound’, ‘1-bound’, ‘idle’, and ‘bicast’. The four values correspond to subsets of {0, 1} by:

[0707] {0}=‘0-bound’

[0708] {1}=‘1-bound’

[0709] {0, 1}=‘bicast’

[0710] null=‘idle’

[0711] Thus a generic rectangle S₁×S₂× . . . ×S_(n) can be represented by a quaternary sequence Q₁, Q₂, . . . , Q_(n), where each Q_(j) here is a quaternary symbol taking one of the three values ‘0-bound’, ‘1-bound’, and ‘bicast’. Each symbol Q_(j) cannot be equal to ‘idle’, because in a rectangle, each S_(j) cannot be a null set. When a packet is destined for a set of output addresses that happens to be a rectangle represented as Q₁, Q₂, . . . , Q_(n), each Q_(j) indicates the preference of the j-th bit of its destination addresses.

[0712] A quaternary symbol can be encoded by two bits. A natural coding scheme here is ‘0-bound’=10, ‘1-bound’=11, ‘idle’=00, and ‘bicast’=01. For example, the rectangle {0, 1}×{0}×{0, 1}×{1}×{0, 1}×{1} in Example 10 can be represented by a quaternary sequence Q₁=‘bicast’, Q₂=‘0-bound’, Q₃=‘bicast’, Q₄=‘1-bound’, Q₅=‘bicast’, Q₆=‘1-bound’, or under the natural coding scheme, Q₁=‘01’, Q₂=‘10’, Q₃=‘01’, Q₄=‘11’, Q₅=‘01’, Q₆=‘11’. Conversely, if the destination addresses of a packet are represented by a sequence Q₁=‘11’, Q₂=‘10’, Q₃=‘01’, Q₄=‘11’, Q₅=‘10’, Q₆=‘01’, the packet is said to be destined for the rectangle {1}×{0}×{0, 1}×{1}×{0}×{0, 1}, which comprises the output addresses 100100, 100101, 101100, and 101101.
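The two worked sequences above can be reproduced with a short Python sketch. It is an illustration under stated assumptions, not part of the disclosed circuitry: the names CODE, SUBSET, encode, decode, and addresses are hypothetical, and ‘bicast’ is taken as 01, the value used in both worked sequences.

```python
from itertools import product

# Two-bit natural coding of the quaternary symbols ('bicast' assumed to be 01).
CODE = {"0-bound": "10", "1-bound": "11", "idle": "00", "bicast": "01"}
SYMBOL = {bits: name for name, bits in CODE.items()}

# Address-bit values admitted by each symbol; 'idle' is deliberately absent
# because a rectangle never contains a null factor.
SUBSET = {"0-bound": "0", "1-bound": "1", "bicast": "01"}


def encode(symbols):
    """Concatenate the two-bit codes of a quaternary sequence."""
    return "".join(CODE[s] for s in symbols)


def decode(bitstring):
    """Split a bit string into two-bit codes and name each symbol."""
    return [SYMBOL[bitstring[i:i + 2]] for i in range(0, len(bitstring), 2)]


def addresses(symbols):
    """List the output addresses of the rectangle Q1, Q2, ..., Qn."""
    return ["".join(bits) for bits in product(*(SUBSET[s] for s in symbols))]
```

With this coding a rectangle of a 2^(n)×2^(n) switch costs 2n tag bits, compared with the 2^(n) bits needed for an arbitrary destination set.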

[0713] In accordance with the present invention, when a packet first enters a 2^(n)×2^(n) banyan-type network with the guide γ(1), γ(2), . . . , γ(n), the packet destined for the rectangle Q₁, Q₂, . . . , Q_(n) is prefixed with the routing tag

Q_(γ(1))Q_(γ(2)) . . . Q_(γ(n))

[0714] The routing tag of an idle packet consists entirely of ‘idle’ symbols and is therefore a string of ‘0’ bits under the natural coding scheme.

[0715] For each stage j, 1≦j≦n, the in-band control signal used by the routing control at that stage is the symbol Q_(γ(j)), which is then either consumed or rotated to the end of the routing tag at the stage. As a result, the leading symbol upon entering each stage j, 1≦j≦n, is Q_(γ(j)). The self-routing control at each stage can be perfectly executed by filling each cell of the 2^(n)×2^(n) banyan-type network with a bicast cell.

[0716] This self-routing mechanism for multicast switching can be extended to 2^(n)×2^(n) k-stage bit-permuting networks. Consider a generic 2^(n)×2^(n) k-stage bit-permuting network with the guide γ(1), γ(2), . . . , γ(k), where γ is a mapping from the set {1, 2, . . . , k} to the set {1, 2, . . . , n}. A packet destined for the rectangle Q₁, Q₂, . . . , Q_(n) is prefixed with the routing tag Q_(γ(1))Q_(γ(2)) . . . Q_(γ(k)). The in-band control signal of a packet to a bicast cell at each stage j, 1≦j≦k, is the leading symbol Q_(γ(j)).
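The construction of the routing tag from the guide, and the effect of consuming or rotating the leading symbol stage by stage, can be sketched in Python as follows. The function names routing_tag and leading_symbols are hypothetical, and the guides used in any trial run are arbitrary illustrations rather than guides of a particular network in the disclosure.

```python
def routing_tag(q, guide):
    """Build the tag Q_gamma(1) Q_gamma(2) ... Q_gamma(k).

    q[j - 1] holds the symbol Q_j; guide lists gamma(1), ..., gamma(k),
    each a 1-based index into q.
    """
    return [q[g - 1] for g in guide]


def leading_symbols(tag, rotate=True):
    """Return the leading symbol seen on entering each successive stage.

    At every stage the leading symbol is removed from the front and, when
    `rotate` is true, appended to the end of the tag instead of discarded.
    """
    tag = list(tag)
    seen = []
    for _ in range(len(tag)):  # one iteration per stage
        head = tag.pop(0)
        seen.append(head)
        if rotate:
            tag.append(head)
    return seen
```

In either mode the symbol at the head of the tag on entering stage j is Q_(γ(j)), so a cell only ever needs to inspect the leading symbol.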

[0717] Priority treatment can be integrated into this self-routing mechanism in the same way as before. Thus let the r-bit pattern p₁ . . . p_(r) represent the priority class. When a packet first enters the network, the packet header is prefixed with

Q_(γ(1))p₁ . . . p_(r)Q_(γ(2)) . . . Q_(γ(n))

[0718] The bicast cell can be modified for the priority treatment similarly as before. The primary in-band control signal used at each stage j is still Q_(γ(j)), while the priority code p₁ . . . p_(r) serves as the tiebreaker when the two packets are both 0-bound or both 1-bound. The switching control at each stage consumes the leading quaternary symbol (or rotates it to the end of the routing tag) and rotates the priority code to the position behind the next quaternary symbol. Therefore, the underlying methodology for the realization of this (multicast) self-routing mechanism over a banyan-type network and the implementation of the related circuitry are very similar to the case of the basic (point-to-point) self-routing mechanism employed in a banyan-type network.
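One reading of the per-stage header update is sketched below in Python. It is an interpretation for illustration only: the header is modeled as a list whose first element is the leading quaternary symbol and whose second element is the whole priority code treated as one item, and the function name advance is hypothetical.

```python
def advance(header, rotate=True):
    """Apply one stage of header processing.

    header = [leading symbol, priority code, remaining symbols...].
    The leading symbol is consumed, or moved to the end when `rotate` is
    true, and the priority code is re-inserted directly behind the symbol
    that becomes the new leading symbol.
    """
    lead, priority, rest = header[0], header[1], list(header[2:])
    if rotate:
        rest.append(lead)
    return rest[:1] + [priority] + rest[1:]
```

After each call the header again has the form symbol, priority code, remaining symbols, so every stage can apply the same rule.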

[0719] 9. Statistical line grouping over a banyan-type network for multicast switching

[0720] In parallel with the self-routing mechanism over a multi-stage interconnection network of concentrators, a similar inventive self-routing mechanism is disclosed for the multi-stage interconnection network of multicast concentrators.

[0721] Take an m-to-n concentrator constructed from a partial sorting network of interconnected routing cells. As stated in the sub-section H7, such a concentrator can be adapted into an m-to-n multicast concentrator by replacing each of the routing cells with a bicast cell. Given a 2^(n)×2^(n) banyan-type network, say, with the guide γ(1), γ(2), . . . , γ(n), fill each dilated node in the b-line version of the banyan-type network with a 2b-to-b multicast concentrator so constructed. The result is a multicast version of the hybrid network described in the sub-section H6 and hence will be referred to as the “multicast hybrid network”. The multicast hybrid network consists of n “super stages” of multicast concentrators. A self-routing mechanism over this multicast hybrid network, in a fashion much parallel to the point-to-point case, is disclosed below.

[0722] The b2^(n) outputs of the multicast hybrid network are in 2^(n) groups of the size b. Each destination of a packet is an output group rather than an individual output in an output group. At a super stage, a packet traverses through a multicast concentrator, which is a multi-stage interconnection network of bicast cells. In accordance with the present invention, upon entering the multicast hybrid network, a packet destined for output groups with the rectangular set of addresses encoded by Q₁, Q₂, . . . , Q_(n) is prefixed with the routing tag Q_(γ(1))Q_(γ(2)) . . . Q_(γ(n)). The in-band control signal to a multicast concentrator in the j^(th) super-stage is Q_(γ(j)), and this quaternary symbol in the routing tag is consumed or rotated to the end of the routing tag by the j^(th) super-stage. More explicitly, the in-band control signal to every bicast cell in a multicast concentrator at the j^(th) super-stage is Q_(γ(j)) except that a bicast packet (with Q_(γ(j))=‘bicast’) and an idle packet (with Q_(γ(j))=‘idle’) are replaced by a 0-bound packet (with Q_(γ(j))=‘0-bound’) and a 1-bound packet (with Q_(γ(j))=‘1-bound’) when they meet at a bicast cell. The consumption of the quaternary symbol Q_(γ(j)) or its rotation to the end of the routing tag is performed not by any generic bicast cell inside a multicast concentrator at the j^(th) super-stage but rather by certain extra circuitry installed at the output end of the multicast concentrator. This extra circuitry handles each packet separately and hence consists of 2b parallel 1×1 switching elements. There may exist other 1×1 elements in the 2b-to-b multicast concentrator, e.g., delay elements for maintaining the synchronization across the stage and annihilators of misrouted packets.
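The behavior of a single bicast cell on the control symbols of its two inputs can be sketched in Python. The sketch combines the replacement rule stated above with the partial order recited in claim 2 (‘0-bound’<‘idle’<‘1-bound’ and ‘0-bound’<‘bicast’<‘1-bound’). The names RANK and bicast_cell are hypothetical, and keeping the input order for incomparable or equal pairs is an assumption made here for simplicity; priority tiebreaking is not modeled.

```python
# Numeric ranks realizing the partial order; 'idle' and 'bicast' share a
# rank because the order does not compare them with each other.
RANK = {"0-bound": 0, "idle": 1, "bicast": 1, "1-bound": 2}


def bicast_cell(a, b):
    """Return the symbols delivered to (output 0, output 1) of one cell."""
    if {a, b} == {"bicast", "idle"}:
        # A bicast packet meeting an idle packet is duplicated: the copy on
        # output 0 becomes 0-bound and the copy on output 1 becomes 1-bound.
        return ("0-bound", "1-bound")
    # Otherwise sort: the smaller symbol goes to output 0.
    # Assumption: ties keep the input order.
    if RANK[a] <= RANK[b]:
        return (a, b)
    return (b, a)
```

Only the symbol transformation is modeled; payload copying and the 1×1 elements at the output end of the concentrator are outside the sketch.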

[0723] Similar to the case of self-routing over a multi-stage interconnection network of concentrators, when the underlying banyan-type network of a multi-stage interconnection network of multicast concentrators is replaced by a more general bit-permuting network, the self-routing control mechanism still applies. More precisely, when the replacing bit-permuting network is a 2^(n)×2^(n) k-stage bit-permuting network with the guide γ(1), γ(2), . . . , γ(k), where γ is a mapping from the set {1, 2, . . . , k} to the set {1, 2, . . . , n}, a packet destined for output groups with the rectangular set of addresses encoded by Q₁, Q₂, . . . , Q_(n) is prefixed with the routing tag Q_(γ(1))Q_(γ(2)) . . . Q_(γ(k)). For 1≦j≦k, the in-band control signal to a multicast concentrator in the j^(th) super-stage is Q_(γ(j)), and this quaternary symbol in the routing tag is consumed or rotated to the end of the routing tag by the j^(th) super-stage. The remaining parts of the control coincide with the above.

[0724] I: PHYSICAL IMPLEMENTATION OF SWITCHING FABRICS CONSTRUCTED FROM RECURSIVE 2-STAGE INTERCONNECTION

[0725] As mentioned in Section B, a switching fabric can be based on recursive invocation of the technique of 2-stage construction. That is, a multi-stage network is constructed by a recursive procedure where the generic step is “2-stage interconnection” and then each node in the multi-stage network so constructed is filled with an appropriate switching element. Throughout this section,

[0726] (a) the term “2-stage interconnection” includes plain 2-stage interconnection, 2X interconnection, X2 interconnection, and generalized 2-stage interconnection, unless otherwise specified,

[0727] (b) the procedure of the recursive invocation of the 2-stage interconnection is called the “recursive 2-stage interconnection” or “recursive 2-stage construction”, and

[0728] (c) the multi-stage network so constructed is called a “recursive 2-stage interconnection network”.

[0729] FIG. 14 in Section B depicts a 30×18 3-stage network 1400 from such a recursive 2-stage construction. Sometimes the method of statistical line grouping may be applied so that a switching fabric is actually based on a multi-line version of a recursive 2-stage interconnection network. FIG. 69 depicts the example of the 8-line version of the 16×16 divide-and-conquer network (6900), which constructs a 128×128 switch when every node in it is filled by an appropriate 16×16 switching element.

[0730] A generic step of recursive 2-stage interconnection is between an array of input nodes and an array of output nodes. The physical implementation of this generic step is by wiring between an array of “input switching elements” and an array of “output switching elements”. In the case of a step of 2-stage interconnection in a b-line version of a recursive 2-stage interconnection network, there would be a bundle of b wires connecting every input switching element and every output switching element. This physical implementation can be at any of the following five levels.

[0731] 1. Level I: Inside-chip implementation. The inside-chip implementation means physical realization inside an IC chip. The I/O switching elements are usually some primitive switching circuitries. The most common primitive switching circuitry is a 2×2 switching cell. A trivial physical realization for it has been depicted in FIG. 65A. Some other primitive switching circuitries, to name a few, can be a 2×2 multiplexer, a 1×2 demultiplexer, a 2×2 expander cell, and so on. This level of implementation can be recursively applied within an IC chip. This level is simply referred to as “chip-level” or just “C-level”.

[0732] For example, the 16×16 divide-and-conquer network (5100) shown in FIG. 51, which is constructed from the recursive 2-stage interconnection of cells, can be physically realized inside an IC chip where all switching elements are 2×2 switching cells.

[0733] 2. Level II: PCB implementation. The PCB implementation means physical realization on a PCB (printed circuit board). Each I/O switching element for this level is an IC chip. This level of implementation can be recursively applied within a PCB. This level is simply referred to as “PCB-level” or just “P-level”.

[0734] For example, the recursively constructed 30×18 network 1400 as depicted in FIG. 14 can be implemented on a PCB wherein the three types of nodes, namely, 2×2 nodes 1401, 3×3 nodes 1402 and 5×3 nodes 1404, are implemented by three different IC chips.

[0735] 3. Level III: Orthogonal packaging. This level of implementation is the physical realization of an “orthogonal package”, which includes two orthogonal stacks, one stack consisting of input switching elements and the other of output switching elements such that every input switching element contacts every output switching element perpendicularly and the interconnection between them is through the contact point. Each I/O switching element for this level is a PCB, or an IC chip packaged into an equivalent of a small board. This level is simply referred to as “orthogonal-level” or just “O-level”.

[0736] The implementation of plain 2-stage interconnection by orthogonal package is depicted by FIG. 73A. External input and output ports are 7300 and 7301 respectively, and the I/O switching elements deployed are PCBs 7302 and 7303. For the plain 2-stage interconnection the interconnection between input switching elements and output switching elements is through the contact points 7304; to implement the generalized 2-stage interconnection, some local rearrangement on 7305 and 7306 prior to the interconnection may be needed.

[0737] Note that this level of implementation requires both the I/O switching elements to be planar. Since an orthogonal package is not planar, it cannot be recursively used in another step of orthogonal packaging. Therefore, the next level, interface-board packaging, is invented to carry on recursive construction in the fashion of perpendicular placements of switching elements.

[0738] 4. Level IV: Interface-board packaging. This level of implementation is the physical realization of an “interface-board package”. The interface-board package includes a printed circuit board as the “interface board”, attached with a number of input switching elements and a number of output switching elements such that the wiring on the interface board creates the interconnection between every input switching element and every output switching element. By the wirings on the interface board, any output port of any input switching element can in principle be connected to any input port of any output switching element; in other words, all kinds of 2-stage interconnections between I/O switching elements can be achieved by the presence of this “magic” interface board. Therefore, the attachment of the I/O switching elements to the board as well as their orientation can be in various ways, varying from design to design, as long as the output ports from the input switching elements and the input ports from the output switching elements are in contact with the appropriate wirings on the interface board such that those wirings achieve the required interconnection. For example, both the I/O switching elements can be attached on the same side of the interface board; or the input switching elements are attached on one side of the interface board, and the output switching elements on the opposite side; or even a mixture of I/O switching elements is attached on one side of the interface board, and a mixture of I/O switching elements on the opposite side. To simplify the description but without losing generality, it is assumed in this context that all the input switching elements are on one side and all the output switching elements on the opposite side. Each I/O switching element for this level can be an IC chip, a PCB, or an orthogonal package; it can also be an interface-board package when this level of implementation is recursively applied. This level is simply referred to as “interface-level” or just “I-level”.

[0739] In the example of FIG. 73B, the interface board 7307 is inserted between two orthogonal stacks of PCBs in order to implement the generalized 2-stage interconnection.

[0740] In the example of FIG. 74, the I/O switching elements are orthogonal packages, 7402 and 7403. The input switching elements are marshaled on the upper surface 7407 of a rectangular interface board, and the output switching elements are marshaled on the lower surface 7408. FIG. 74B provides more detail of the construction above the interface board. The interface board 7409 turns the 2-dimensional output array 7405 of an input switching element 7402 into a linear horizontal array 7410. Symmetrically, the interface board also turns the 2-dimensional input array of an output switching element into a linear vertical array. Thus the relative orthogonal placement between the linear horizontal arrays (7410) from input switching elements above the interface board and the linear vertical arrays from output switching elements below the interface board is logically equivalent to that in orthogonal packaging.

[0741] 5. Level V: Fiber-array packaging. This level of implementation is the physical realization of a “fiber-array package”. Each I/O switching element in a fiber-array package can be an IC chip, a PCB, an orthogonal package, or an interface-board package; it can also be a fiber-array package when this level of implementation is recursively invoked. Interconnection lines between input switching elements and output switching elements are implemented by a physically flexible communication medium, exemplified by optical fibers. This level is simply referred to as “fiber-level” or just “F-level”.

[0742] It is worth pointing out a difference between the recursive application at the C- or P-level and the recursive application at the I- or F-level. A step at the I- or F-level results in an interface-board package or a fiber-array package, which can be used in the next recursive step. In contrast, a step at the C- or P-level does not necessarily result in a whole IC chip or PCB; rather, such a step only logically results in a larger input or output switching element for the next step of implementation. For example, the 6×6 networks 1403 constructed from the 2-stage interconnection of 2×2 nodes (chips) 1401 and 3×3 nodes (chips) 1402 are not PCBs; they are just used to interconnect with another group of 5×3 nodes (chips) 1404 in the next step to produce the resulting 30×18 network, and the whole process is on a single PCB.

[0743] In practice there is a precedence ordering among these five levels of physical implementation. A step of inside-chip implementation can be followed by steps of implementation at any of the five levels. A step of PCB implementation can be followed by steps of implementation at any level except the C-level because a PCB cannot be used as an I/O switching element for the recursive construction inside an IC chip. A step of orthogonal packaging can be followed by a step of implementation at only the I- or F-level because an orthogonal package cannot be used as an I/O switching element in the construction inside an IC chip, on a PCB, or in another orthogonal package. A step at the I- or F-level can be followed by a step of implementation at only the I- or F-level for similar reasons.
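The precedence relationship among the five levels amounts to a small lookup table. The Python sketch below is a direct transcription of the rules in the preceding paragraph; the names ALLOWED_NEXT, can_follow, and valid_sequence are hypothetical and serve only as an illustration.

```python
# For each level, the levels at which the next construction step may be
# implemented (C = chip, P = PCB, O = orthogonal, I = interface, F = fiber).
ALLOWED_NEXT = {
    "C": {"C", "P", "O", "I", "F"},
    "P": {"P", "O", "I", "F"},
    "O": {"I", "F"},
    "I": {"I", "F"},
    "F": {"I", "F"},
}


def can_follow(earlier, later):
    """True if a step at level `later` may follow a step at `earlier`."""
    return later in ALLOWED_NEXT[earlier]


def valid_sequence(levels):
    """Check a chain of successive construction steps, earliest first."""
    return all(can_follow(a, b) for a, b in zip(levels, levels[1:]))
```

For instance, the chain C, P, O, I, F satisfies the table, whereas a P-level step followed by a C-level step, or two successive O-level steps, does not.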

[0744] Recall that the procedure of the recursive invocation of the technique of 2-stage interconnection can be logged by a binary tree diagram. For example, the recursive procedure leading to the 30×18 3-stage network 1400 can be logged by FIG. 15 in Section B. Meanwhile, the recursive procedure leading to the 16×16 divide-and-conquer network 5100 is logged by the 4-leaf balanced tree 5010 shown in FIG. 50A. As stated in Section B, when each leaf in a binary tree is associated with a prescribed network, then the tree is “associated with” or “corresponding to” a recursive 2-stage interconnection network with the prescribed networks being the “building blocks” in the construction. Each internal node of the tree corresponds to a particular step of 2-stage interconnection in the associated recursive 2-stage construction. When a binary tree is applied to the physical implementation of the recursive 2-stage construction, such correspondence can be summarized as follows:

[0745] (a) Each leaf of the tree corresponds to a switch that is a building block of the overall construction and cannot be implemented in any of the aforementioned levels. Such a switching device can be a primitive switching circuitry as stated above, an existing switching chipset, or an existing switch on a PCB, etc.

[0746] (b) Internal nodes in the binary tree correspond one-to-one to steps of 2-stage interconnection in the associated recursive 2-stage construction. Thus the step corresponding to each internal node can be implemented at a particular one of the aforementioned five levels. In short, an internal node is said to be corresponding to a particular level if the internal node corresponds to a step of recursive construction wherein the step can be implemented at that level.

[0747] One point should be noted here. The father-son relationship among internal nodes in a binary tree suggests a precedence ordering among the steps of 2-stage interconnection: when an internal node is the father node of another, the step corresponding to the son node must be executed before the step corresponding to the father node. This precedence ordering must be consistent with the aforementioned precedence relationship among the five levels in the physical implementation of a switch based upon a recursive 2-stage construction. For example, if the step of 2-stage interconnection corresponding to an internal node is implemented on a PCB, then the step corresponding to its father node can also be implemented on the same PCB but cannot be inside a chip.

[0748] FIG. 75A depicts an example of mapping each internal node of a binary tree 20010 to one of the levels of physical implementation, where each of the internal nodes 20011, 20012, 20013, 20014, and 20015 corresponds to a 2-stage interconnection implemented at C-, P-, O-, I-, and F-level, respectively. In this mapping the father-son relationship is consistent with the precedence relationship among the five levels in the physical implementation.

[0749] The same tree appears in FIG. 75B with exemplifying dimensions of the building block corresponding to each leaf and also of the network constructed at each step of 2-stage interconnection corresponding to each internal node. The whole construction yields a 4096K×4096K switching network; the dimensions of the switching network would be further enlarged when the method of statistical line grouping is applied.

[0750] FIG. 75C shows an exemplifying list of generic components in the physical structure of this 4096K×4096K switching network 20061. The generic components include Chip-1 20051, Chip-2 20052, Chip-3 20053, Chip-4 20054, PCB-1 20055, PCB-2 20056, PCB-3 20057, an orthogonal package 20058, an interface-board package 20059, and a crossbar switch 20060. The IC chip 20052, PCB 20056 and the crossbar switch 20060 are building blocks, each corresponding to one or more leaves in the binary tree. Chips are organized into PCBs. The generic PCB-1 20055 implements the recursive 2-stage interconnection network associated with the sub-tree rooted at the internal node 20071. The generic PCB-3 20057 implements the recursive 2-stage interconnection network associated with the sub-tree rooted at the internal node 20072. PCBs are interconnected into orthogonal packages. The generic orthogonal package 20058 implements the recursive 2-stage interconnection network associated with the sub-tree rooted at the internal node 20073. Then the PCB-1 20055 and the orthogonal package 20058 are interconnected into the interface-board package 20059. Finally, the 4096K×4096K fiber-array package 20061 implements the recursive 2-stage interconnection network associated with the whole binary tree.

[0751] Although the present invention has been shown and described in detail herein, those skilled in the art can readily devise many other varied embodiments that still incorporate these teachings. Thus, the previous description merely illustrates the principles of the invention. It will thus be appreciated that those with ordinary skill in the art will be able to devise various arrangements which, although not explicitly described or shown herein, embody principles of the invention and are included within its spirit and scope. Furthermore, all examples and conditional language recited herein are principally intended expressly to be only for pedagogical purposes to aid the reader in understanding the principles of the invention and the concepts contributed by the inventor to furthering the art, and are to be construed as being without limitation to such specifically recited examples and conditions. Moreover, all statements herein reciting principles, aspects, and embodiments of the invention, as well as specific examples thereof, are intended to encompass both structural and functional equivalents thereof. Additionally, it is intended that such equivalents include both currently known equivalents as well as equivalents developed in the future, that is, any elements developed that perform the function, regardless of structure.

[0752] In addition, it will be appreciated by those with ordinary skill in the art that the block diagrams herein represent conceptual views of illustrative circuitry embodying the principles of the invention.

What is claimed is:
 1. An m-to-n multicast concentrator for routing input signals, each of the input signals being 0-bound, 1-bound, bicast, or idle, the concentrator comprising m input ports to receive the input signals, m output ports partitioned into two groups wherein m−n of the m output ports are grouped as a 0-output group and the remaining n output ports are grouped as a 1-output group, and means, responsive to the input signals, for routing a maximum total number of 0-bound and bicast ones of the input signals to the 0-output group and the maximum total number of 1-bound and bicast ones of the input signals to the 1-output group.
 2. A method for self-routing input packets in an m-to-n multicast concentrator, the multicast concentrator having m input ports to receive the input signals, m output ports partitioned into two groups wherein m−n of the m output ports are grouped as a 0-output group and the remaining n output ports are grouped as a 1-output group, and a multi-stage interconnection network of bicast cells, each of the input packets being 0-bound, 1-bound, bicast, or idle determined by a routing tag in a packet header, the method comprising configuring each of the bicast cells, in response to the two input packets arriving at said each of the bicast cells are in a specified combination such that one of the input packets is a bicast packet and the other is an idle packet, to produce a copy of the bicast packet at each of the two output ports of said each of the bicast cells, modifying the routing tag of the copy of the bicast packet produced at the output-0 group such that the routing tag indicates that the copy is a 0-bound packet, and modifying the routing tag of the copy of the bicast packet produced at the output-1 group such that the routing tag indicates that the copy is a 1-bound packet, and configuring each of the bicast cells, in response to the two input packets at the said each of the bicast cells wherein the combination of the two packets is other than the specified combination, to sort the two input packets with respect to the partial order “‘0-bound’<‘idle’<‘1-bound’ and ‘0-bound’<‘bicast’<‘1-bound’”.
 3. A method for implementing an m-to-n multicast concentrator with reference to the network topology of an m-to-n concentrator, the m-to-n concentrator having m−n output ports grouped as a 0-output group and n output ports grouped as a 1-output group and being constructed from a multi-stage interconnection network of sorting cells, the method comprising constructing a multi-stage interconnection network of nodes having the same network topology as the multi-stage interconnection network of the m-to-n concentrator, and filling each of the nodes of the constructed network with a bicast cell.